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I. SYNTAX-DIRECTED EDITING IN THE MODERN PROGRAMMING 


ENVIRONMENT 


A. PROGRAMMING ENVIRONMENTS 


Over the past ten years .or so, computer scientists have 
been devoting increasing attention to the notion of a 
“programming environment": the set of software and hardware 
tools available to aid the programmer in the performance of 
his task. In the past, the frogramming environment merely 
consisted of disjoint systems frograms that the programmer 
had to invoke deliberately and sequentially to input, trans- 
late, and execute his programs. Today, however, environ- 
ments are designed so that individual tools are oboth more 
useful and well-integrated as farts of a whole, with the 
overall result that program development is facilitated 
rather than hindered. 

As late as the 1970's, program development was an itera- 
tive and tedious process. Some of the tools in a typical 


programming environment were: 


keypunch nachine: used to enter program instructions on 


(usually) 80-column data cards; 


card reader: a Machine used to read the deck of data 


cards into the computer's nenmory; 


compiier: a program that translated high-level language 


programs into assembler language or internal machine 


language. Note that if translated into assembler 
language, an assembler program was also required for 
conversion into machine language -- in fact, this progran 


sometimes had to be provided by the programmer as part of 


his card deck; 


linkage editor: a program that linked object programs 
{the machine code produced by the assembler or compiler) 


and certain control informaticn before loading; 


loader: a program that loaded object modules and needed 


library routines for subsequent execution; 


line printer: a machine which usually produced the only 


visual output from the system described above. 


The tools described above formed a strictly "batch" 
environment. This environment was not significantly 
improved with the addition of time-Sharing, which basically 
involved the combination of input and output devices into a 
teletype-style terminal. However, time-sharing did give 
rise tc stored files of programs and data, primitive editing 
features to create these files, and new control words (such 
aeeeet RU NT) to combine several compilation-to-execution 
Peimitives. | 

Consider the process the frogrammer had to follow to 
develop a correct program using the tools described above. 
After designing an algorithm to solve his real-world problen 
and selecting a programming language, he usually drafted the 
program on paper and desk-checked its correctness by step- 
ping through the program one statement at a time. When he 
was satisfied that his program was correct, he keypunched it 
onto data cards and combined them with the necessary control 
cards to invoke the tools he desired. A typical card deck 


included such cards as: 


job card: to uniguely identify the program while in the 


compiler card: to invoke the compiler of the chosen 


a 
ee ee ee Se ? 


assembler card: to invoke the assembler (reguired if the 


compiler's output was assembly language and not machine 


code) ; 


the assembler program: if reguired, as a deck of cards; 


load card: to invoke the (usually system-provided) 
loader; 

object modules: program portions previously compiled for 
inclusion in this program (reusable subroutines, Lom 


example); 
data card: to tell the system that input data followed; 
input data cards; 


end card: to sSignify the end of the data (and the job) 


[Ref. 1: p. 200}. 


After preparing the card deck, the programmer fed the 
deck into the card reader and waited for his output, which 
was usually produced on the line printer. If the progran 
contained a compilation error, the programmer nad to deter- 
Mine the cause of error (usually with the help of a diag- 
nostic message of questionable utility), edit his program by 
typing new cards to replace the erroneous ones, and resubmit 
his deck through the card reader. Even if the program 
compiled successfully, it might have been aborted during 
execution because of a run-time error, again necessitating 
the error detection and correction procedures previously 
mentioned. A third error type that occurred was the logic 
error that compiled and executed but produced incorrect 
output. After checking the output and determining it was 
incorrect, the programmer again had to determine the cause 
of error (but this time without any diagnostic aid from the 


system) and repair the progran. 
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Even if the expense of computer time were not a factor 
{which it was in that era), one can easily understand the 
other reason for program drafting and desk-checking: to 
lessen the personal frustration of program correction and 
resubmission [Ref. 2: pe 445). Clearly the programming 
environment was not conducive to program development. Jia 
forced the programmer to concern himself with satisfying its 
requirements, avoiding aborted runs, and minimizing the 
number of job submissions, while his prime concern should 
have been the problem he was originally trying to solve. 

It is certainly true that the hardware technology of the 
time played a role in causing the unfriendliness of the 
programming environment. However, the arrival of faster, 
cheaper computers and interactive time-sharing with intelli- 
gent terminals in the 1970's did little more than replace 
the above process with the tedious cycle of invoking an 
editor, editing a program, saving the program, exiting the 
editor, invoking the compiler, debugging, and re-invoking 
the editor to effect changes. Only recently has attention 
been directed toward taking better advantage of modern capa- 
bilities to provide a useful, productive programming 
environment. 

Because programming envircnments are such a current 
research topic in computer science, there are many opinions 
as to what a modern environment should do for the user. ie 
1s safe to say, however, that it should do much more than 
Simply correct the obvious deficiencies of previous systems 
as discussed above. Sandewall [Ref. 3: pp. 35-36], for 
example, presented a list of some of the desirable functions 
of a programming environment, which included administration 
of program modules, test cases, and documentation; interdia- 
lect translation; compatibility checking between program 
segments; support for a particular development methodology 


(Such as top-down design); enhanced support of the 


11 


interactive session (to enable, for example, the programmer 
to back up through his commands and undo their effects).; and 
specialized editing based on the editor's knowledge of the 
syntax of the programming language. Winograd [Ref. 4: p. 
14] envisioned a futuristic environment as a “moderately 
Stupid assistant, to whom we give all the information we 
possibly can, and who in turn relieves us of much of the 
burden of memory, tedious checking, and drawing more-or-less 
straightforward conclusions." Based on the above, it is 
appropriate to Summarize Simply that a programming environ- 
ment should do everything possible to facilitate progran 
development. 

There are many "state-of-the-art" programming environ- 
ments in operation, each possessing somewhat different capa- 
bilities. Interlisp [Ref. 5], for example, provides an 
environment for the development of LISP programs. DUE ag 
interactive sessions, the user talks exclusively to the 
Interlisp systen. The program being developed is created, 
stored, and manipulated as a data structure by the system's 
structure editor, which displays the program in textual form 
for the user. A facility called "Masterscope" analyzes and 
cross-references the program to provide such information as 
whieh EUuUNncCtAoOnsS <Callewhaven: how and where variables are 
bound, and so on. Interlisp also includes a DWIM ("Do What 
I Mean") facility which, upon error detection, attempts to 
determine what the user intended and automatically make the 
necessary correction. 

Another example of a modern programming envircnment is 
the Cornell Programming Synthesizer for PL/CS, a subset of 
the PL/I language [{Ref. 6]. It allows creation and editing 
of programs tarough a syntax-directed editor, which uses 
templates based on the language's grammar to insure the 
syntactic correctness of the program. Like Interlisp, it 


stores. the program internally in a tree structure but 


liz 


displays it in textual forn. The Synthesizer aiso includes 
sophisticated debugging aids that permit tracing the flow of 
execution through the program at any user-selected rate. 
The user can step the program cne statement or construct at 
a time, and may command the system to display the value of 
particular variables as he does So. (This is an excellent 
example of an envircnment freeing the user from a tedious 
eerivity. Contrast such a feature with the outdated advice 
given in [Ref. 2: Dee oo | wine n states that to trace a 
program's progress, “the use of additional WRITE commands in 
strategic places is the most useful technique.") 

One characteristic of both eé€nvironments described above, 
and of modern environments is general, is the integration of 
individual tools. The progress of a particular tool is 
shared with the others, with the result that the system both 
eliminates duplication of effort and gains knowledge about 
the program being developed. For example, the syntax- 
directed editor of tne Cornell Program Synthesizer produces 
an executable derivation tree fcrm of the program during the 
editing session; using such a structure as an interface 
between tools, Subsequent comfilation or direct execution 
can begin without the re-parsing which would have been 
required had a conventional text editor been used. The 
Interlisp tools are often invoked from within each other by 
the user, allowing him to consider program segments fron 


different perspectives without losing his place in the 


program. Sharing of program knowledge among the tools thus 
can provide a more responsive overall environment. In the 
future, programming environments may even resemble 


Winograd's System A [Ref. 4], which comes to "understand" a 
program as it is being developed, forming its own comments 
and checking the program against the user's apparent 
intentions. 
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Be SYNTAX-DIRECTED EDITORS 


A SYntax-direered eadagton, as 1ts name suggests, is a 
tool used to edit pregrams based on the syntax of an under- 
lying programming language. Typically, it utilizes 
templates of language constructs inside of which the 
programmer enters such items as variable names, procedure 
calls, output strings, and sc on. The goal of such an 
editor is to free the programmer from concern over syntactic 
issues. A program constructed with a syntax-directed editor 
is assured to be free of syntax errors. 

One advantage of a syntax-directed editor 1s that 
overall development time may tre reduced by avoiding edit 
sessions whose sole purpose 1s to correct syntax errors for 
subsequent be-compilation. If the editor utilizes 
templates, two additional advantages may be realized. 
First, the user need not even learn the details of the 
language's syntax -- he merely has to know which template to 
install at a given point in the progran. Second, selection 
of tempiates with single keystrokes may reduce the time 
spent in the editing frocess itself when compared to typing 
the symbols in the constructs individually using a text 
editor. 

Syntax-directed editors typically represent the progran 
they are editing as a tree structure, and present a textual 
image to the user through the templates he has selected. If 
the internal representation of the program is in fact a 


structure suitable for subsequent interpretation or code 


generation, then the editor serves aS a parser as well. in 
terms of the overall environment, this allows the presence 
of a much simpler (and faster) compiler or interpreter 


needing no scanner, parser, or syntax error recovery mech- 
anism. Some limited facility for program translation fron 


one high-levei language to another may also be possible, if 
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Een thoanslation amounts to direct substitution of one set 
of templates for another. 

There are several examples of syntax-directed editors in 
widespread use today. One that most neariy matches the 
Mesceriptaon above 1S that found in the Cornell Progran 
Synthesizer [Ref. 6]. It uses templates for PL/CS grammat- 
ical constructs and creates prcegrams top-down by inserting 
templates and phrases into the existing templates. However, 
this editor is more than a Simple syntax-directed editor 
because it insures a degree of semantic correctness as well. 
For example, it identifies variables that are referenced but 
not declared. As part of the overall programming environ- 
ment, its product is directly executable by other tcols. 
Even when a program has not been fully created, it can be 
interpreted up to the point of incompletion. New code can 
then be entered, and interpretation can resume. 

Interlisp [Ref. 5], also mentioned above, has an editor 


that Danipulates a program thrcugh its syntactic structure 


rather than its textual form. Due to the syntactic 
eamplaicity of LISP, however, this editor does not use 
templates; Virtually any combination of atoms and tiists 


comprise a syntactically (1f not Ssementically) correct LISP 
progran. Originally, Interlisp's editor was designed for 
teletype-style interaction and had no full-screen capa- 
pr iity. More recently, a display-oriented editor, DED, has 
been included to enhance Inrterlisp's interactive nature 
[Ref. 7]. 

One variation on syntax-directed editing is to combine 
the qualities of syntax-directed editing with text editing. 
[Ref. 8] describes a family of such editors produced froma 
Hybrid Editor Generator, which receives as input a specifi- 
cation for a grammar and outputs an editor for that 
language. These Automatically Generated Editors allow the 


user to enter menu selections tc create program segments and 
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havigate within a tree, as in a syntax-directed editor, but 
they also allow the - user to enter text at any stage in the 
editing process. The text is parsed by the editor, and the 
tree produced by the parse is grafted onto the existing 
program tree. Another editor in this category is the "Zz" 
editor at Yale University [Ref. 9]. It possesses syntax- 
directed features such as automatic indentation, automatic 
balancing of expressions, user-directed selection of entire 
syntactic units, andan adjustable level of display detail. 
However, since it uses a text-oriented model of a prograa 
rather than a tree structure, it 1S more accurate to state 
that Z is atext editor capable of simulating many of the 
functions found in a syntax-directed editor. 

Whereas Z is a text editor that Simulates a syntax- 
directed editor, there also exist syntax-directed editors 
that panipulate text. EDS [Refs 109), for example, 1s an 
editor "primarily designed for manipulation of hierarchi- 
cally structured texts." It does so by superimposing a tree 
structure onto the text, analogous to a structured outiine 
or table of contents. The section to be edited or viewed is 
selected by navigating around the tree. Further discussion 
of structured "document editors" may be found in [Ref. 11] 
and [Ref. 12}. 

The MENTOR system {Ref. 13] includes an editor that can 
accurately be called syntax-directed because it edits 
programs Dy manipulating abstract syntax trees based on the 
grammars of programming languages. The system utilizes a 
tree manipulation language, MENTOL, which includes prime 
tives from which macros may be created to tailor the systen 
to edit programs ina particular programming ianguage. A 
viable set of Pascal macros currently exists. Note, 
however, that because the MENTOR system may be configured to 
handle any of a variety of languages, itis accurate 


described as "a processor designed to manipulate structured 
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Saece menetea lsc pe 129 ]) in that it can edit any information 
that can be structured under a format acceptable for input. 
The notion that syntax-directed editing may be applied to 
structures other than program trees is further discussed in 
[Ref. 14]. 


C. INTRODUCTION TO THE SDE AND OVERVIEW OF THIS PAPER 


With the above discussion of modern programming environ- 
ments and syntax-directed editing as background, this paper 
will discuss the writer's development of a tabie-driven 
syntax-directed editor capable of manipulating information 
structured under virtually any context-free grammar. Lhas 
editor, hereafter called the SDE, stores, retrieves, and 
edits tree structures based on the rules presented in an 
input grammar selected by the user. Interactive in natutfe, 
it is menu-driven and terminal-independent. As will be 
seen, its manner of tree manipulation also gives it limited 
language translation and other desirable properties. 

The SDE was based primarily on the work found in 
[Ref. 15]. It was programmed in Pascal as compiied by the 
Berkeley compiler, and is currently in operation within a 
Unix environment on a VAX 11/780 minicomputer. The reader 
is also invited to read [Ref. 16], which presents an 
in-depth discussion of table-driven syntax-directed editing 
and which served as research material both for this parer 
and for [Ref. 15]. 

Chapters Two, Three, and Four of this paper may be 
viewed as describing the SDE in progressively greater levels 
of detail. Chapter Two describes a sample session using the 
SDE and serves as an introducticn to its operation. Chapter 
Three discusses the conceptual ktasis of the SDE, including 
the algorithms it uses to display and store information. 


Finally, Chapter Four discusses detailed implementation of 
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the SDE to include data structures used, display implementa- 
tion, and data storage. 

Chapter Five assesses the accomplishments of the SDE 
both in theory and as a product. It describes the SDE'"s 
design decisions anad may serve as an "after-action report" 
On ther sper. Improvements and future development are also 
discussed. Chapter Five further contrasts syntax-directed 
editors with text editors and discusses the implications of 


syntax-directed editing in general. 
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II. A SAMPLE EDITING SESSION WITH 


A. GENERAL 


The SDE was designed for interactive sessions using a 
computer terminal. It displays edited programs in textual 
form on the screen, but creates, manipulates, and stores 
them as program trees. The SDE does not hide this represen- 
tation from the user. Rather, many of its commands are 
worded to guide the user through the tree he is editing, 
constantly reminding him that his real product is something 
other than its textual representation. 

The user interacts with the editor by typing the desired 
command, followed by a carriage return. He can also enter a 
series of commands (up to 80 characters in total length) 
followed by a single carriage return. Illegal commands are 
detected and reported when entered individually; when 
entered as part of a string of commands, they are reported 
and the rest of the string is ignored. 

{In the paragraphs that follow, terms such as 
"Control-A" or "AA" refer to the consecutive striking of the 


"control" and "a" keys on the keyboard.) 


B. INITIALIZING THE SDE 


Every session with the SDE commences with the SDE 


presenting a series of questions to the user as follows: 


GRAMMAR FILE: (Enter the name of the file containing the 
grammar the SDE will use to parse and display the 


progran.) 


PROGRAM FILE: (Enter the name of the file to be edited. 


If editing an already-existing file, the file must be in 


1.9 


readable format to the SDE -- which is assured if it was 
written by the SDE using the same grammar file in the last 


editing session.) 


FILE ALREADY EXLiSIS (1 Vege (This question must be 
answered to prevent the SDE from attempting to access a 
file that doesn't exist. It is a limitation of the Pascal 


implementation of the SDE.) 


The present implementation allows all of the above 
anformation to be provided on the “command line" that 
invokes the SDE. Thus the same initialization could fe 
achieved by typing SDE PASCAL DEMO.P Y, for example. 

(A third file, called "TEah", is also accessed by the 
SDE and must be present in the environment. This Lage 
provides information to the STE about the display screen 
being used. Details about this file are provided in 
Appendix F.) 

At this point in the session, the SDE has read the 
grammar and "TERM" files and has organized the information 
contained in them. If a pre-existent program file was indi- 
cated, this file has also been read and processed; the 
program as last edited appears cn the screen. If creating a 
hew program, a skeleton of a program (based on the selected 
grammar) appears. In either case, a menu of choices also 
appears at the bottom of the screen. The initial display of 
a new program using the "Minigcl" grammar in Appendix E is 
shown in Figure 2.1. 

Note in the figure that the current focus of attention 
(hereafter called the "current fosition" or CP) is indicated 
by underlining. Oh an actual display screen, the currene 
position is indicated by inverse video (if the terminal 
supports it) or by any distinguishing characters indicated 
In Chest IERU! opiate 
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begin <decl>* <statement>;... 
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Figure 2.1 Initial Display for New Minigol Progran 
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C. MOVING AROUND IN A PROGRAM 


The following paragraphs assume the current state of 
editing to be as depicted in Figure 2.2. (The grammar in 
use 1S, again, the "Minigol" grammar of Appendix E.) 

Figure 2.2 indicates that the Current Position is the 
™] < 10" portion of the "while" statement. Observe the 
commands the editor makes available to the user when at this 
CP. Setting aside the more general commands for now, one 
notes that the user may move the CP either to the right, to 
Peeeonild," or to a "parent." These movements make sense 
when one realizes that he is moving througa the program tree 
and not directly through the text. Thus, moving to the 
"parent" shifts the CP to the nede in the program tree whose 
Sees include the "i < 10" porticn of the program; moving to 
the right shifts the CP to its ktrother node to the immediate 
Figut in the tree; and moving to the "child" shifts the CP 
to the leftmost son of the (present) CP. 
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Figure 2.2 Sample State of Editing Using Minigol Gragmar 


(Actually, this is not entirely true. when the user 
selects a command to move the CP, the actual direction of 
movement is hidden frem hin. ‘However, the apparent move- 
ment, as seen on the display, is in the direction selected 
by the user. For a more thorough explanation cf this opera- 
tion, see Section 3.C.) 

As the user moves the CP abcut the program tree, the set 
of legal commands changes. For example, the command to move 
to a right brother is offered only if that brother exists. 
The entire set of movement commands (each being offered when 
applicable) includes "parent" (to move upward in the tree), 
"child" (to move downward in the tree), and “right =amd 
"Left" (to move to brother nodes on the same level in the 


tree). An additional command, "rest seg," applies only when 


the CP points to an element of a sequence, such as a 
sequence of individual declarations within a block. This 
command positions the CP tc reference all subsequent 


ere 


elements of the sequence, and is useful when attempting to 
delete or move the remaining elements. 

Returning to the situaticn depicted in Figure 2.2, 
suppose the user types "contrcl-s" to move the CP to the 


might . When the screen is updated, it will appear as shown 
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Figure 2.3 Editing State, CP = "begin-end" Block 


ie rigure 2.3. The menu portion is unchanged except that 
the command to move to the right has been removed (indi- 
cating there are no more brothers to the right) and the 
command to move to the left has been added (indicating the 
previous location of the CP). Ly tigre ontLol— Tat ahs 
point causes the strange display indicated in Figure 2.4. 
The CP 1s located at a node in the program tree that has no 
textual representation on the screen. ii tilts part rteula Lr 
instance, ‘the CP is the "declarations" portion of the 
"begin-end" block, which the user chose to close off ina 


previous editing session. A "nil" node remains in the tree, 


2: 


however, and can be accessed in the same manner aS any other 
node. Should the user later wish to insert declarations in 
this block, he can erase the "nii" node (using control-lL), 


at which time the SDE will prompt him for declarations. 
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Figure 2.4 Editing State, CP = Declarations 


D. EDITING A PROGRASA 


Moving around in a program tree may be considered a 
"passive" activity in that it has no effect on the tree 
itself. The following paragraphs discuss those editing 
commands which change the program tree -- in other words, 
the actual "editing" functions cf the SDE. 

The SDE 1s capakle of perfcrming the following editing 


functions: 


delete a portion of the program tree; 
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create a portion of the tree (by buiiding from a nonter- 


minal leaf in an incomplete program tree) ; 


move a portion of a program tree from one location to 


another; 
insert a subtree into a sequence of like subtrees. 


The above capabilities describe what is performed on the 
tree. Viewed in terms of their textual effects, these func- 


tions enable the user to: 


delete a user-defined name, a statement segment, an entire 


Statement, or a block of statements in a single command; 
add to the current [rogran; 

move text from one location tc another; 

insert an item into a seguence of like items. 


Note that on a more general level, the SDE allows the uSer 
EOeADD or DELETE information. It does not, however, ailow 
the user to directly CHANGE information (for example, 
through a "global replace" operation), ailithough this func- 
tion may be realized indirectly through a series of DELETE 
and ADD commands. (Reasons for this limitation of the SDE 
and ypossible corrective implementations are discussed in 
Chapter 5.) 

Deletion of a program seyment is simple. The user moves 
the CP until it references the entire portion to be deleted 
(as indicated by highlighting with inverse video on some 
terminals), then enters the control-L command ("erase"). 


That portion of the text 1s removed and replaced with the 


nane of the honterminal node type expected there. 
("<decl>*" is an example of such a node type. Its presence 
tells the user that this portion of the program is 


incomplete.) 
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Adding to the existing tree is dependent on one's loca- 
tion within the tree. First, such an operation is legai 
only when the CP references a nonterminal node as described 
in the previous paragraph. When the CP references a nonter- 
Minal on the screen, it is referencing a leaf on the (incon- 
plete) program tree to which sons must be added to complete 
the tree. Second, the nature of this specific nonternminal 
dictates the choice of possible sons from which the user may 
select. Referring again to Figure 2.1, the menu includes 
commands to select an "integer" or "real" declaration. 
These options would not have teen offered if the CP were 
referencing the "statement" nonterminal instead of the 
"decl" nonterminal. 

Assuming the user selected command "n" (for "“integer") 


te Co Acad ee e2 atl the display would then resemble that of 
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begin 
integer <id> <decl>* <statement>;... 
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F2qune 2.45 Program Display While Creating Declarations 
FIGUEC 2... Note that the word "integer" has been added to 


tne display, the CP references anew "var" nonterminal, and 


the menu choices reflect the new CP. The SDE's automatic 


ZO 


movement of the CP is a feature optimized to permit the user 
to create his entire program from top to bottom (of text) 
Without having to move the CP himself. 


Moving a tree segment from cne location to another is a 


two-step process. First, the user must "grab" the desired 
portion of the tree or text. This is done by positioning 
the CP on the entire porticn desired, then entering 


Gontrol-H for "grab" along with a digit from 0 to 9. (Note, 
then, that the SDE can maintain up to ten "grabbed" segnents 
at a time.) No change occurs on the display, because grab- 
bing a program segment does net delete its present occur- 
rence. The "grab" function is therefore a "copy" function 
which allows duplication of program segments. To delete the 
original occurrence, the "erase" command discussed above may 
be applied aiter the segment has been grabbed. 

The second step in moving a segment is to place it in 
its new location. This new location must be a norterminal 
leaf as described above. Further, the nonterminal must be 
compatible with the root of the program segment tc be 
attached. Thus, one can not attach a seguence of declara- 
tions where a sequence of statements is expected, nor can he 
even attach it where a single declaration (not a sequence) 
is expected. The user attaches a grabbed program segment by 
entering the "put" command (control-X) along with the digit 
(0 through 9) referencing the grabbed segment. It the 
segment is not compatible with the CP, an error message will 
be displayed and the graft will not take place. 

The final editing capabiiity or the SDE, inserting, is 
accomplished through the contrcl-[ key, which invokes the 
“insert before" command. This command is offered only when 


the CP references an item in a sequence of items in the 


tree. A sequence is defined in a grammar by the "*", "4u, 
or "..." property of anonterminal as displayed on the 


screen. Thus "<decl>*" and "<statement>;..." both indicate 
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sequences. Entering control-{ allows the user to entepean 
item into a sequence textually in front of the item refer- 
enced by the CP. For example, Figure 2.6 shows the display 


after the user selected control-[ when the CP had indicated 








begin ; 
integer i <decl> 
integer j 
te) 
while 1 < 10 do 
begin | ; 
J2= 2 * 1; 
s=1+ 1 
end 
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Figure 2.6 Display After Selecting "Insrt Befr" 


"integer j". He can now enter a single declaration to be 
inserted as indicated, using the normal creation commands in 
the menu. 


Ee. TERMINATING AN EDITING SESSION 


The user terminates an editing session by entering 


control-E. The SDE then asks him two guestions: 
SAVE PROGRAM IN PARSED FORM (Y¥ OR N)? 


SAVE TEXT FORM (Y OR N)? 
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Hiewiinst guiestion above corresyonds to the "save" or "quit" 
command found in text editors, for it enables the user 
either to save the edited tree oor discard it. If the user 
indicates he wishes to save the parsed form, it is saved 
under the same file name entered during the initialization 
process; thus the previous version of the program, if any, 
is lost. If the user instructs the SDE not to save the 
parsed form, the previous versicn remains intact. 

The second question relates to the text form of the 
created program. At the user's response to this question, 
the text form of the program may be saved ina file to be 
hamead by the user. Note that this file is textual, andis 
not suitable for input to the SDE at a later date. However, 
it is useful in that it can be retained as a text file for 
archival or inspection purposes. Furtoer, if complete, it 
represents a syntactically correct program ready for input 


to a conventional parser or interpreter. 


Fe ADDITIONAL FEATURES OF THE SDE 


While the above capabilities represent a functional 
syntax-directed editor, the SDE contains several additional 
features to make it more interactive and responsive to the 
user's needs. For example, the "display toggle" disables 
the display of the menu, allowing the user to view more of 
his program on the screen. A second entering of the command 
("7") will restore the menu. 

A more Significant display feature is the combination 
"change focus" and "change derth." The "focus node" is 
defined as that node in the tree at which screen display 
begins. When viewing the entire program, the focus node is 
thus the root of the tree. Note that the focus node is 
always the root of a subtree, and only that subtree will be 


displayed. The "depth" value indicates how many generations 
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of descendants from the focus node are to be displayed. 
Descendants below the depth iimit will be displayed by an 
ellipsis ("..."). The combination of the focus node and the 
depth limit allows the user to see a detailed view of one 
portion of his program, or to see an abbreviated view of his 


entire program, oon the display. Figure 2.7 represents a 


13 = 60 
while i< 10 do 
begin . 
J2= 2 en; 
1:= 1 + 1 
end 
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Figure 2.7 Program Display, Focus = Statements 


display of the program listed in Figure 2.2 when the focus 
is adjusted to view only the "statements" portion of the 
program; the depth limit is set sufficiently high to permit 
viewing of all aspects of the statements. Figure 2.8, om 
the other hand, represents a broader perspective of the same 
progran. In this case, the fccus node is the root of the 
tree, and the depth limit has been set low. 

A focus node is selected by positioning the CP over the 
desired program segment to be viewed, then entering 


controi-A for the "chg focus" command Note that such a 
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Figure 2.8 Program Display at Low Depth Setting 


process can serve only to bring the focus "closer" to the 
lowest program level. Elevating the focus toa higher 
perspective 1S accomplished through the control-P ("parent") 
command, which automatically raises the focus as necessary 
whenever the CP 1s moved upward in the tree. The depth 
Mmlit 12s set by selecting control-B for "chg depth," 
followed by a positive integer. 

The final feature to be discussed is the "move toggle." 
AS mentioned previously, the SDE's automatic movement 
feature is optimized to permit top to bottom entering of 
text without manually moving the CP. This feature, however, 
tends to act against the user when editing a pre-established 
program portion. To inhibit this feature, capital M can be 
entered. All subsequent movement of the CP must be directed 
by the user through the movement commands discussed in 
Section 2.C. } 
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Tit. THE CONCEPTUAL BAS 


3 OF THE SDE 


A. PARSING AND TRANSLATION ON A CONTEXT-FREE GRAMMAR 


The textual form of a ccmputer program is written 
according to the rules of the programming language's 
gcammar, which for mcst programming languages is classified 
as being context-free. (Current programming languages often 
include features that make them more complicated than 
context-iree. Two examples of such features are the 
reguirement for variables to tke deciared before they are 
used and the requirement that procedures be declared anda 
invoked with the same humber cf parameters [Ref. 17: De 
140 j. Languages with such features, however, are staim 
considered and treated as context-free, with their special 
cases handled as exceptions on individual bases [Ref. 18: p. 
ZO. ) Formally, a context-free grammar is a four-tuple 
(N, E, P, S), where Nis a nenterminal alphabet, E isa 
terminal alphabet, E and WN are disjoint, S is the) "Seam 
symbol" and an element of N, and Pisa set of productions 


of the form A --> x such that in each production, 
A is an element of N; 


x is a string formed by combining any finite number 


(including zero) of elements from N and E. 


The set of terminal strings that can be formed by 
applying the rules of the grammar is, in effect, the set of 
programs that can be written using that grammar. A parser 
is a program that, given a string of terminals, determines 
if it is a legal program from the lanyuage's grammar. The 
parser does this by finding the derivation (the sequence of 


applications of the grammar's rules) that would produce the 
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terminal string. This derivaticn may be represented {and is 
usually perceived) as a "derivation tree" whose root is the 
nonterminal symbol S and whose leaves are the terminals that 
Make uf the progran. For example, consider the following 


context-free grammar: 


Mee= {A, Ty FF}; 


Peat, ety (ys je Ge Cs S} 5 
S = A; 
P = the set of productions 
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a> Ss 
The derivation tree of the legal progran "q * (r + sj)" is 


S7Ovn in Figure 3.1. 

While the parser's function is, by @derinition, the 
determination of the syntactic correctness of a program, the 
derivation tree it creates is also an important product in 
itself. In order for a computer to execute the original 
program, the program must be translated into a more suitable 
form known aS intermediate code, which can either be inter- 
preted directly or optimized and translated again into 


machine-executable code (compiled). Program translation is 
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Figure 3.1 Derivation Tree of "gq * (r + s)® 


accomplished through a "Syntax-Directed Translation Scheme," 
or SDITS [Ref. 18: p. 279], which conceptually tlansiormoageee 


derivation tree into a “translation tree soe 
1) removing the terminal nodes; 


2) permuting the children of each interior node accordums 


to a particular translation rile. 


3) adding new terminal nodes, members of a new terminal 
set. 
Formally, an 5) BES is aqefined as a five-tuyle 


(N, E, D, R, S), where N, E, and S are the same as above, D 
is the terminal alpnabet of the translation, and R is a set 


of productions A --> x,y such that in each product iem 


A is an element of N; 
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x 1s a string of terminals from E and nonterminals fron N 


(as above) ; 


y is a string of terminals frem D and nonterminals fron N; 


there is a one-to-one association of nonterminals in x and 
Wie 


Note that by following an SDTS, two trees may be 
constructed. The first is the derivation tree, produced 
Pcome the "A —--> x" portion of the productions. The second 
tree may be created from the "A --> y" portion of the 
productions. This second tree is the translation tree, and 
constructing it in parallel with the derivation tree accon- 
plishes the three conceptual transformations listed above 
[Ref. 18: p. 296]. In fact, it 1s the translation tree, not 
the derivation tree, which is the desired by-product of the 
parser, for it is a representation of the program's interme- 
diate form. 

As an example of program translation, consider the 
following SDTS, which is an extension of the context-free 


grammar described earlier: 


N PAgel, 0 he 
Mueerits “se (se )e Ge Le S};3 


Pees {ADD, MPY, gg, I, Ss}; 


R = the set of productions 
A--> A+ TT, <A T ADD 
aa > Ti, T 


ienaa-> cee F, Tt F MPY 


35 


F --> (A), A 


Te Farge ah, g 

== ae c 

F -~> Ss, S 
The translation tree of the prcgram "gq * (r + Ss)" 1s Shown 
beside the program's derivation tree in Figure 3.2. Using 


this SDTS, the translation cf the original program is 
"gq r s ADD MPY" (which is the same expression in postfix, or 


postfix Polish, notation). 
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Figure 3.2 Derivation, Translation Trees for "g * (r + s)" 


One should note that parsers seldom actually construct 
the derivation or translation tree as conceptualized =apoues 
Some more efficient representation, often involving a stack, 


is frequently used instead [Ref. 18: p. 46]. 
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B. THE SDE AS PARSER AND TRANSIATOR 


As stated above, two major functions of a parser are to 
determine syntactic correctness and translate the source 
program into intermediate code. The SDE also performs these 
same functions, although ina different manner. Syntactic 
correctness iS assured because the editor only creates 
correct programs. Translation is accomplished by dynami- 
cally creating the translaticn tree during the editing 
session. 

The SDE creates and edits programs based on an infut 
grammar of the user's choice. The grammar file represents 
productions in a manner consistent with the above discus- 
Sion: each production is of the form A --> x,y where x and y 
are as described above. The SDE, however, uses the elements 
of x and y in a manner different from that of the SDTS. In 
the SDE, the "x" portion of a production (hereafter referred 
to as the “analysis part") determines the textual form of 
that production in the prograz as displayed to the user 
during the interactive session. The analysis part of a rule 
acts as a template which displays the terminals in a ruie 
and treats nonterminals as "hcles" to be filled in using 
other rules. The "y" portion cf a rule (hereafter referred 
to as the "synthesis part") determines what will be added to 
the tree being created by the SLE when that rule is selected 
by the user during the editing frocess. 

Intuitively, the SDE only creates syntactically correct 
programs because the terminals are written by the editor, 
not by the programmeLr/user. Whereas a conventional parser 
useS grammar rules to determine whether a given input string 
of terminals is correct, the SDE uses the same grammar rules 
momecreate the correct input string. For example, based on 
the sample grammar above, the string "g * r + s)" is illegal 


because of unbalanced parentheses. A programmer could 
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erroneously enter such an expression using a text editor, 
and a parser would detect the error. Ii the programmer had 
used the SDE to create the program, however, the parentheses 
would have been balanced automatically when he selected the 
rules in the grammar to produce the string he wanted. Note 
that neither of the parentheses, nor the "*" or %+" opera- 
tors, would have actually been typed by the programmer at 
aie The SDE would have displayed these terminals as parts 
of the templates selected by the user. 

When a user creates a program using the SDE, he selects 
rules to be applied to replace the nonterminal leaves 
currently in his unfinished tree. As he selects each rule, 
he instructs the SDE to build onto the translation tree 
according to that rule's synthesis part. What he sees on 
his display, however, is determined by the analysis part of 
the rule. Thus, the programmer dynamically creates the 
translation tree during the editing session, but need only 
concern himself with the textual form on the display. Any 
translation that takes place is hidden from the user. 

Actually, the claim that the SDE creates a translation 
tree is not entirely accurate for several reasons. Pips 
an SDTS allows the translation of terminals from E (the 
source language alphabet) into other terminals from D (the 
translation alphabet). While ir the example given above the 
user-defined names q, r, and s translated into themselves, 
the technical definition of an SDTS does not reyguire this, 
so they could have peen translated into any terminals in D. 
The SDE, however, stores the actual user-defined names fron 
Ein the tree it creates; no translation is performed on 
then. (The SDE is capable of performing such translation 
but its implementation is inefficient and its format proves 
exceptionally tedious to the grammar writer.) 

Another reason it is inaccurate to claim the SDE creates 


a translation tree is because the SDE is in fact a general 
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purpose structure editor and not simply a program editor. 
The term “translation tree" implies that the tree contains 
information to be used further in a compilation or interpre- 
tation process. While the SDE can certainly be used to forn 
such a tree, it is not limited to these applications. The 
purpose and structure of the tree produced by tne SDE depend 
on the intention of the input grammar designer -- there 
might be no "translation" involved. (Chapter 5 provides a 
thorough discussion of the range of applications of the 
SDE.) 

Finally, the translation ability of the SDE is somewhat 
limited. {Ref. 17] states that tne intermediate code 
produced from a practical SDTS is usually classified into 
one of four categories: postfix, abstract syntax tree, guad- 
ruple, or triple notation. A simple example of postfix (or 
postfix Polish) notation has already been provided. The SDE 
can provide such a translation -- Appendix D, for example, 
lists an SDE input grammar fcepresenting the SDTS used 
earlier in this chapter. Abstract syntax trees are sinpli- 
fied derivation trees in which the interior nodes are opera- 
tors and the leaves are operands. The SDE generally can not 
create atree simplified to this extent because certain 
nodes having no semantic value need to be retained for the 
display information in their analysis parts. (An inter— 
preter or code generator using such an intermediate form 
would have to be tolerant of these useless nodes.) Triple 
and quadruple notation are discussed in [Ref. 17]. The 
possibility of translation to these forms was not explored 
in preparing this paper. 

Based on the above considerations, 1t is more accurate 
to say that the SDE creates a tree which may possibly 
achieve a translation of the source program into an interme- 
diate form useful to an interpreter or code generator. it 
is always true, however, that the SDE insures syntactically 
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correct program creation. The textual representation of the 
tree it creates is therefore acceptabie as error-free infut 
to a conventional parser, so the SDE is a useful editor 
regardless of the external value of the tree itself. 

It should be noted that the SDE‘s transiation facimiiaae 
however limited, represents a major difference between the 
SDE and editors such as that described in {Ref. 16]. These 
editors create a representation of a program's derivation 
tree, not its translation tree. It may informally be argued 
that the SDE is at least as powerful as such editors, for 
derivation trees result from using grammars whose synthesis 
parts simply reflect the nonterminals in the corresponding 
analysis parts. 

Finally, note that the use of the term "translation" in 
this chapter refers to translation from a high-level 
language toan intermediate fcrn. Translation from one 
high-level language to another is also possible using the 
SDE, and this type of translation will be discussed in 
Chapter ys. 


C. A CLOSER LOOK AT INPUT GRAMMARS 


AS mentioned above, an input grammar to the SDE isa 
series of rules of the form A —> x,y where x is the anal- 
ysis part and y is the synthesis part of the rule. The Soe 


accepts these rules and organizes them into a list of 


records, each of whose menbers contains the foliowing 
fields: 
hame; the name of the rule (and the nonterminal being 


replaced) ; 


ana lyvislS pare: an ordered list of the terminals and 


nonterminals to be displayed; 
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Svlenests pakt; the name of the node to create in the 
tree, with an ordered listing of the sons of that node. A 
node may have both nonterminal sons and termfinai sons. 
(Note, however, that terminal sons represent terminals of 


the transiation, not terminals in the textual progran); 


nonterminal dictionary: an index that relates the nonter- 


Minals in the analysis part to the nonterminal sons in the 


Synthesis part. 


The above components may best be understood through an 
example. Consider the rule "A --> A+ T" from the sample 
grammar in Section 3.A, but assume its translation is to be 
"T A ADD" (which represents a reversal of the order of the 
operands). This rule must be redesigned for input to the 
SDE as follows. Note the use of quotes to mark terminals 
and parentheses to mark nonterminals in the analysis’ and 


synthesis parts: 
name: A 
mmeysis part; (A) “+ " (7) 
Synthesis part: 
node to be created: A 


sons of node are: 


paths opndZ expected node: (T) 
Zaepe thse opnod 1 expected node: (A) 
Bye path: OprEtr expected node: "ADD" 


The SDE will accept this rule as input, adding it to its 
list of grammar rules after creating for it a nonterninal 


meet Lonary Containing the’ infor tation: 


1) path: opndt expected node: A 


4 4 


2) eopatnes Opadz expected nede: T 


The name and analysis porticns of the above rule are 
straightforward. The synthesis part, when invoked, causes a 
nonterminal node named "A" to be created in the tree. This 
node will be given paths to three sons, andthe paths will 
be given unique names to distinguish them from each other. 
The relationship between each path name andthe expected 
honterminal node at the end of the path is recorded in the 
nonterminal dictionary. Paths to terminals are omitted from 
the dietionary. 

The synthesis part of the above rule deviated from the 
translation in Section 3.A to demonstrate that the order of 
nonterminals in the analysis part need not be duplicated in 
the synthesis part. Note that the nonterminal dictionary 
entries preserve the order of the nonterminals as they 
appear in the analysis part. Thus, a seguential access of 
nonterminal dictionary entries will provide access’ to the 
nonterminal sons in analysis part order, which is therefore 
independent of the order in which they are logicaily stored 
in the tree. 

(Note that input grammar rules for the editor in 
[Ref. 16] need only contain analysis parts. This editen 
generates synthesis parts based on an examination of the 
analysis parts, which 1s possible because, as mentioned 
previously, the editor creates a derivation tree, not a 
translation tree.) 


The SDE permits a grammar rule whose left side nonter- 


Minal also appears on the right side of the rule. In otter 
words, recursive productions of the form A --> ABC are 
permitted. (Grammars including such rules, however, must 


provide alternative productions to apply to end the recur- 
sion. This is analagous to the requirement that a context- 


free grammar produce only finite-length programs.) Further, 
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a production rule may produce mcre than one son of the same 
node type. Thus, peoduGetonseoO: the form A --> ABBC are 
also permitted. In such a case, it is the responsibility of 
the nenterminal dictionary to distinguish between similar 
nodes in a rule and provide the correct path to whichever 
node is requested. 

The notation provided thus far, and the two capabilities 
listed above, are sufficient tc show that the SDE supforts 
the set of all context-free grammars less those that contain 
"e-productions," or productions of the form "A --> e" where 
"e" represents the null symbol (or put another way, produc- 
tions whose right sides contain uo terminals and no nonter- 
minals). [Ref. 19] states, however, that a context-free 
gcammar with e-productions is eguivalent to one without such 
productions, except in the case where the null string is a 
member of the set of strings derivable from the grammar. 
Thus, the SDE supports any context-free language that does 
not include the "empty progran." 

The SDE input grammar convention as described above, 
therefore, is sufficient to handle all useful context-free 
languages. For ease of grammar design, however, the SDE 


also supports several common grammatical conventions: 


the Kleene "*", meaning zero or more occurrences ofa 


~ 


mieeclilpsis ("..."), meaning one or more occurrences of a 


node separated by a delimiter; 


Ene option ({"?"), meaning zero or one occurrence of a 


node. 


The SDE also allows an abbreviated format for collecting 


productions with common left sides. This convention results 
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in a second rule type, the “alternation” [ Ref. 15:5 po e2gee 


with the following structure: 


ID 


ame: same as above; 


in 


eries of: 


Ghouce. 7d: the command the user will input to identify 


this selection; 
analysis part: same as above; 
synthesis part: same as abcve; 


nenterminal dictionary: Same as above. Note that, as 


above, the dictionary is generated by the SDE, not 


provided by the grammar; 


agisplay: what the SDE will display in the menu to 


G€eseripesenmes, choice. 


The above is more than a Simplifying convention. It allows 
the grammar designer to compcse his own "display" and 
"Choice id" fields. It also has important implications to 
be discussed in the following section. 

It should be noted taoat the SDE reguires a strict format 
for its input grammars which has been avoided in the above 
discussion. A thorough description of grammar input and 


Storage is included in Chapter 4 and Appendix B. 


D. TREE CREATION, DISPLAY, AND NAVIGATION 


AS mentioned previously, the user manipulates a tree 


structure created by the SDE but views the creation in 


textual forn. In this section are presented the two algo- 
rithms that correspond to tree manipulation and tree 
display. Tree display will be discussed first, since a 
introduces concepts needed to understand the tree 
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Manipulation routine. Also discussed in this section is how 
the user moves about in the tree. 

To understand tne display of the tree, ie wiomec lL rst 
necessary to visualize the tree itself. Eadcieanode 7oOr the 


tree is a record structure with the following fields: 
Mame: the name of that node type; 


syntax: a reference to the grammar rule that produced the 


node; 
parent: a pointer to the parent of that node in the tree; 


enald list: an ordered list whose members point to the 


sons of the node in the _ tree. Each such pointer is 


uniguely identified by its "path" attribute. 


(The above description of the childliist is the first glimpse 
of the SDE's actual implementation presented in this paper. 
Whereas the nodes ofa tree are usually pictured as having 
direct pointers to a possibly variant number of sons, the 
Pascal implementation of the SDE necessitates an expandable 
linked list of pointers. This implementation detail will 
remain exposed throughout the following discussion to avoid 
confusion in Chapter 4, when the full implementation is 
presented.) 


The above record structure is used for all the nodes of 


a tree, although these nodes may fall into one of three 
categories: 
1) nonterminals: by definition, they have syntax refer- 


ences and sons in the tree; 


2) terminals of the translation (such as "ADD"): these 
nodes have no sons in the tree, nor do their syntax parts 


ceference anything; 
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3) terminals from the source language (representing user- 
defined names): these nodes are similar to terminals of 
the translation, but they ccntain user-provided terminal 


Symbols instead of grammar-prescribed names. 


The tree, therefore, is a structure whose interior nodes are 
nonterminals and whose leaves are either terminals of the 
translation or user-provided terminal names. 

The displaying of the textual form of a program based on 
its parsed tree form is known as "“unparsing," and may be 
represented in a recursive algorithm as follows. Assume the 
existence of a tree as described above. To unparse such a 


tree, begin at the root node and follow the following steps: 


1) If the node is a terminal of the translation, take no 


action and return from the recursion; 


2) If the node is a user-provided terminal, display the 


terminal and return from the recursion; 


3) If the node is a nonterminal, use its "syntax" refer- 
ence to access the rule that generated the node. Access 
that rule's analysis part and nonterminal dict?ongew 


Consider each item of the analysis part in order: 
a) If the item is a terminal, display it; 


dD) If the item is a nonterminal, look it up in the 
honterminal dictionary to get a path to the appropriate 
son in the tree. If there is a node at the end of this 
path, go to it and unparse it recursively. If there is 
no node here, display the name of the expected nonter- 
Minal on the screen -- this indicates a program not 


fully created. 


The above unparsing algorithm is only a brief summary of 


what the SDE performs. Details concerning the grammatical 
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abbreviations ("*", "+", and sc on) have been omitted, as 
have display considerations such as indenting, depth 
handiing, and so on. A more thorough explanation is 
provided in Chapter 4. 

The conversion from tree to display has been discussed. 
What remains is to discover how the user dynamically creates 
the tree, a process which shall be referred to as "parsing" 
Since this is what a conventional parser would do givena 
Pext input. 

There are two general categories of user input to the 
SUG. One category is the set of lanyguage-independent or 
"standard" commands which the user may invoke either to nove 
about in the tree or to adjust a part of the tree already 
created. They are standard inthe sense that (as a Set) 
they are legal at virtually any phase of program development 
or position within the tree. Examples of standard commands 
are Move Right, Delete, Grab, and so on. 

The second category of input is the set of language- 
dependent, "special" editing ccmmandS which cause creation 
of new nodes in the tree. They are speciai in that their 
appropriateness is strictly dependent on one's location in 
the existing tree as related to the input grammar. FOr 
example, commands to create an assignment statement are not 
valid when positioned in a "declarations part" of the tree. 

While standard ccmmands have been defined as beirg legal 
at any location in the tree, note that certain members of 
this set will be invalid on certain occasions. For example, 
it is illegal to Move Right to the next son of a node if it 
has no more sons there. Similarly, it is illegal to access 
the parent of the root node or descend to the son of a leaf. 
Such exceptions, however, have nothing to do with the input 
grammar being used -- they are purely functions of one's 


iocation in the tree. 
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The key to program creaticn lies in the algorithm to 
effect the special ccmmands, and to understand this algo- 
rithm it is first necessary tc introduce the mechanism by 
which the SDE identifies its "place" in the tree. The 
current location is determined through a pair of values 
called "CN" (current node) and "CP" ({(currenteeatinie The CP 
always references a fath from the CN to one of its nonter- 
minal sons. Viewed from the user's perspective as presented 
in Chapter 2, the CN always references the parent of the 
node of concern -- which is at the end of the CP. (This is 
why "Cp" was defined as "Current Position" in Chapter 2. 
The SDE highligats whatever is pointed to by the Current 
Path, through inverse video or some other terninal-dependent 
feature. Since this is the only visual indication to the 
user of his place in the program, the "Current Path" nay 
rightly be calied the “Current Fosition" as well.) 

The CN will always reference a node currently in the 


tree, while the path indicated Ly the CP may or may not have 


anode at its’ end. Note that speciai commands are only 
valid if the CP's path has no node at its end -- in other 
words, something can be added but not overwritten. 


Recalling that the synthesis part of a grammar rule names a 
node to be created and an ordered list of uniguely identifi- 
able paths to sons, the kernel of the SDE'S creation ailgae 


rithm is as follows: 


1) Access the grammar rule that created the CN node 


through its "syntax" attribute: 


2) Search the nonterminal dictionary of this rule to find 


the name of the nonterminal associated with the CP. 
3) lLIook up the grammar rule fer this nonterninal. 


4) Using the synthesis part of the rule just found, create 


a nhew node at the end of the CP's path in the tree. Also 
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to be created is the node's childlist, which will include 
the identities of the paths indicated in the synthesis 
wide ts Further, if any son is identified in the synthesis 
part as being a terminal, this son is also created; it 


will have no syntax part or childlist values. 


Note that the above algorithm makes no use of the user's 
am put . In fact, if a grammar having only one possible 
production from each nonterminal were input, user interac- 
tion would be totally unnecessary -- but the grammar could 
only create one program. Any useful grammar involves selec- 
tions from alternative producticns to be applied at partic- 
ular locations in the tree. In the sample grammar in 
Section 3.A, for instance, AO eacdtlOnmueor thew [ —-> F 
production results ina different expression than if the 
T--> T * F production were selected. Note that whereas a 
conventional parser selects the appropriate production based 
on the input text, the SDE must allow the user to direct the 
selection of a production as the program is being created. 
This is accomplished through the alternation rules in the 
grammar. The alternation is the oniy rule that relies on 
Bmemuser’S input at ali -—— all productions with pre- 
determined synthesis actions can be performed automatically. 
This fact can be utilized to fcrm a new creation algorithn 


as follows: 
1) same as above; 
2) same as above; 
3) same aS above; 


4) If the rule just found ig an alternation, match the 
user's command to the appropriate choice. Proceed with 


Step 4 as above and exit the algorithn; 
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5S) If the rule is not an alternation, follow Step 4 anwae 
earlier algorithao, change the CN to reference the node 
Just created, change the CP to reference the path to the 
new CNts first nonterminal son, and repeat this algorithn 


tron Stepwi. 


Note that Step 5 in the above algorithm requires finding the 
first nonterminal son of the (new) CN. What if this node 
has no nonterminal sons? Intuitive, such a node should 
have been created uSing an alternation rule, and thus Step 
4, not 5, should be executed. The reasoning behind this 
statement is that unless the user had the option to select 
the particular set of terminal sons from other choices, the 
nonterminal that produced these terminal sons is, in 
reality, Simply an abbreviation for that seguence of 
terminal sons elsewhere in the grammar; thus an equivaient 
grammar may be designed that removes these deviant cases. 
Appendix C provides rules for grammar design which resuit in 
the development of input grammars which avoid such problems. 

One additional feature of the SDE should be discussed 


here: it is possible to specify a rule with no synthesis 
Pabt,at ~ aie Such a rule is called an "identity rule" 
[Ref. 15: p.- 22], and is of the form A —--> B where =e 
Single nonterminal. The advantage of such a rule is the 


creation of a more compact tree without loss of semantic 
information, which in turn implies less memory requirements 
and quicker unparsing by the SDE as well as guicker inter- 
pretation or code generation after editing. Whenever such a 
production is encountered in the above creation algorithn, 
the nonterminal on the left side is replaced by that on the 
right and the rule for this nonterminal is used instead. 
(This 1S one reason the term "expected node" was used in 
describing what a path should lead to. If the rule stemming 


from an expected nonterminal iS an identity, then the path 


a0 


will lead toa different nonterminal node than originally 
anticipated.) Note that the right side of the production 
must contain only a single necnternminal. If it were to 
include any terminal information, such information would be 
lost during unparsing, because the node's syntax attribute 
would reference a different rule. 

The final matter to be resolved is how a user navigates 
through the tree. Chapter 2 defined the five legal moves as 
Parent, Child, Right, Left, and Rest of Sequence. Note that 
in ail cases, the user may only move to a nonterminal node. 
(User-defined terminals are accessed indirectly by moving 
the CP to the parent of the terminal, which 1S a grammar 
nonterminai.) "Right" and "left" refer to nonterninal 
brothers as indicated by a rule's analysis part, so that the 
user May give commands based on what he sees on the screen. 
(Recall from Section 3.C that a "right" brother in the anal- 
ysis part of a rule may actually be constructed as a "left" 
brother as dictated by the synthesis part.) PUGtHCE , 
"Chiid" selects the first nonterminal son of a node (again, 
"first" being determined by the rule's analysis part). 
Implementation of these commands is facilitated by the 
structure of the tree nodes, the creation of the nonterminal 
dictionary for each grammar rule, and the use of the CN and 
CP to indicate one's current fosSition in the tree. The 


commands are implemented as follows: 


Parent: Access the CN's "parent" reference. This reiter- 
ence becomes the new CN value, andthe CP is set to the 


first nonterminal child of the new CN; 


Child: Access the node at the end of the CP and set the CN 
to reference this node. set the CP to the path indicated 


by the first entry in the nonterminal dictionary for the 
new CN; 
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Right/ieft: Access the syntax reference in the CW node and 
Find the CP's path name in its nonterminal dictiOnamee 
Retrieve the preceeding (for left) or succeeding (for 
right) path name from the dictionary, aS appropriate. 
Traverse the CN's childiist until the corresponding path 
is found, then set the CP to reference this path. 


"Rest Seg" iS a movement command available only when the 
CP references an item in a Sequence of syntactic constructs. 
Such a seguence iS impiemented ina particuiar way, as 
Gescribed in Chapter 4, and the "hkest Seg" command utilizes 
that implementation. Discussicn of this command is tkere- 


fore postponed until the following chapter. 
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Aw. GENERAL 


The following sections present various aspects of tke 
SDE's implementation. First a general description of the 
SDE's manner of interaction with the user is provided, 
followed by a description of some of the SD&£' data types 
and primitive operations that act on or use these data 
[mpes. Each OL the SDE'S major processes are then presented 
in detail. Finally, tue SDE's frogram storage and retrieval 
functions are discussed. 

In studying the following sections, the reader will 
wonder why certain decisions were made concerning the SDE's 
implementation. Rather than discuss each such decision, 
this chapter will present the SDE as it exists, explaining 
trade-offs and decisions only where beneficial to that pres- 
entation. Chapter Five, which represents an assessment of 
the SDE, will identify strengths and weaknesses of the 
implementation and will suggest improvements where needed. 

AS a general comment, however, it should be noted that 
the SDE is more of a prototype than a finished product. AS 
such, it contains structures and procedures that are less 
than optimal in terms of efficiency. Some, such as the use 
of a linked list to house the grammar rules, were used 
because they were logically straightforward or easy to 
implement in Pascal. Much of the implementation, however, 
exists because the SDE was constructed based on the concepts 
presented in [Ref. 15], and the program inherited some of 
the conventions of that paper. For example, the input 
grammar format utilizes parentheses to delimit alternations 


and analysis and synthesis parts of rules, reflecting the 
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Lisp format used extensively in that paper. Data types 
which model "a-lists" (or "“asscciation lists," collectiigm. 
of attribute-value pairs) and "tagged a-lists" (a-lists Gdaen 
possessing a single "tag" field at the head of the 1ist) 
also mimic structures found in the original paper, and the 
SDE's algorithms are similarly based on those found in the 
reference. Following the approach outlined in [Ref. 15] 
facilitated the SDE's development. It is expected, however, 
that subsequent implementations will deviate more irom the 
details of previous work to preduce faster, more efficient 


products. 


Be. INTERACTION WITH THE USER 


Interaction petween the user and the SDE can ke divided 


into two distinct phases; display of textual information 
(the output of the SDE) and acceptance of the user's 
commands (which serve as input to the SDE). At the highest 


conceptual level, the SDE executes a cycie of displaying the 
current status of the program anda menu of appropriate 
commands based on that status, accepting the user's command, 
and implementing the command. Interaction is thus a serial- 
ized process of display and input. These two functions are 
discussed individually Lelow. 

Display of SDE output requires knowledge of the specific 
terminal type in use. The SDE must know how to activate and 
cancel anverse video, how to position the cursor at the 
beginning of the menu, and how to clear the screen and posi- 
tion the cursor at the top of the display. It must also 
know the number of lines and columns in the display. 
Because taking advantage of terminal-specific algorithms or 
capabilities would limit the SDE's terminal independence, 
knowledge of the required information is instead provided 


LEO me med user-Supplied external file (the "TERM tome 
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mentioned in Cnapter 2) that can be modified or replaced 
whenever a different terminal is used. The SDE reads the 
"TERM" file during initialization and forms four linked 
lists of integers whcse associated ASCII codes will, when 
sent in succession to the screen, cause the display to 
perform the four desired functicns. Thus, to refresh the 
display, the SDE writes the "clear screen" ASCII sequence to 
the terminal, followed by the Pascal "writeln" commands that 
display the program text; if the menu is to be displayed (as 
determined by the "dsply togl" value), the "move cursor" 
sequence is then sent to position the cursor on the menu 
portion of the screen, and additional "writein" commands 
display the menu selections. 

One advantage of this approach is that any sequence can 
be stored in the "TERM" file. Terminals that do not support. 
inverse video, for example, may have printable characters in 
their "TERM" files so that, when the SDE activates’ the 
inverse video function, these characters are displayed 
instead. Cancelling the inverse video may be done in the 
same manner. (An example of such a display gay be seen in 
Figure 4.1.) Another use of this feature involves the "move 
cursor" seguence that positions the cursor for the menu 
dispiay. On some terminals, this sequence may need to 
include instructions to clear the remainder of the screen in 
order to erase the previous menu. 

The input of user commands to the SDE is complicated by 
the use of Pascal aS a programming language. Lito was 
initially intended that the SDE refresh the screen with 
every character the user input so that he could see the 
immediate effect of his entry and view a current, relevant 
menu. However, Pascal input/output requires that a carriage 
return ve entered in order for prior entries to be read by 
the progran. Since entering a carriage return after each 


command would prove unnecessarily tedious to the user, the 
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beGian. 
-> integer 1 
a a eee DiS 
a 0 
while 1 <¢ 10 do 


~~ aa 


a ee ee ree 
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Figure 4.1 Sample Display for Terminal W/O Inverse Video 


SDE presently accepts a string of commands, up to 80 charac- 


ters in total length, before the carriage return need be 


sent. The string's commands are processed individually as 
if entered separately. The list of legal selections is 
updated after each selection is acted upon, although the 


screen (including the menu) is updated oniy after the final 


command is completed. 


C. DESCRIPTION OF SDE DATA TYPES 


The following paragraphs describe the data types used by 
the SDE to store grammar and program tree information. A 
formal definition of these data types, as expressed in 
Pascal, is presented in Appendix A. 

At the outset, the SDE's string representation must be 
explained. Because Berkeley Pascal string capabilities are 
limited, strings are stored as records having a "“wrd" field 
containing the actual characters in the string and a)" Hens 
field representing tne length of the striag. Note also that 
the "wrd" rield utilizes the Eerkeley Pascal “alfa type, 
which allows a maximum string length of 10 characters. 

Grammar rules are stored in a linked list of records, 


each of which contains the name of the rule (the nonternminal 
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on the left side of the production) anda pointer to the 
Tight side of the rude. If the rule is a Simpie production 
(with a single right side), the pointer references a defini- 
Eton Containing analysis, synthesis, and nonterminal 
dictionary parts. On the other hand, ii the rule is an 
alternation (as discussed in Section 3.D), the record points 
to a second iinked list whose members each contain the 
Single-character choice used to select the particular 
production, the display string used to represent the choice 
in the menu, anda pointer to the definition of the produc- 
tion. Figure 4.2 represents the first three entries in the 

lanked list of rules in the "Minigol" grammar in Appendix E. 


Note that all rules, simple or alternation, eventually 
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Figure 4.2 Storage of Rules in Minigol Grammar 
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A closer look at the definition part of a rule reveals 
that it consists merely of pointers to analysis, synthesis, 
and nonterminal dictionary parts. The analysis part iSa 
linked list of records representing terminals and nontermi- 
nals. (Note that the "info" field for a nonterminal is the 
Mame of the nonterminal, while the same field ina terminai 
record represents a string of characters, possibly including 
formatting commands, to be displayed or acted upon during 
unparsing.) The synthesis part represents a "tagged a-list" 
(as defined in Section 4.A) whose tag represents the node to 
be created in the tree and whose attribute-value fairs 
represent paths to expected children. The nonterminal 
Gictionary is an "a-list" whose entries reorganize _ the 
information contained in the analysis and synthesis parts. 

Note that all three definition portions contain an 
"affix" field. This is a record of three characters used to 
further describe the particular terminal or nonterninal. 
Nonterminals may be affixed with a Kleene "*", Kleene "ti, 
ellipsis (represented by a single "."), option symbol ("?"), 
or nothing (represented by a "&" because the SDE currentiy 
requires a non-blank character to be read). If an ellipse 
is indicated, a second character must be affixed to desig- 
nate how to separate the individual items in the sequence. 
Finally, if two or mere of the same nonterminal appear ina 
single production, a third character (a "prime" mark or a 
digit) must be provided to distinguish between then. 
(Actually, in such a Situation the distinguishing mark is 
the first of the three affixes.) Terminals need no affixes, 
Buiecne field is included for simplicity of typing and to 
prevent accessing nonexistent fields. (While a variant 
record structure could have been used, it would only have 
Saved the three bytes of storage required by the affixes, at 
the expense of more complicated logic as well as the space 


required for the variant structure itself.) 


5 


A significant feature of input grammars is the “wseweg 
the SDE "set" data type to specify the members of an alter- 
hation rule whose definitions contain only sSingle-character 
terminals on tne fright side of the productions. Fou 
example, the production "char -—-> a j| b | cc" can be Degeee 
sented as an SDE "set" instead of a linked iist of alterna- 
tives. Note that definitions eligible for representation as 
sets would have no nonterminals in their analysis parts and 
childless tag fields, identifying the selections made by the 
user, in their synthesis parts. To the SDE, such piroaiee 
tions perform no useful function except to record the user's 
selections. A more efficient treans of storing such prodauc- 
tions in a grammar (as well as an eaSier method for the 
grammar designer to write them) is to list them as the 
(logical) set of all terminals derivable from the left-side 
nonterminal. The SDE stores a grammar's sets in a linked 
list whose records each contain a set name (the name of the 
left-side nonterminal) and a {Pascal) set of ali the deriv- 
able terminals. 

The above paragraphs descrite the data types used by the 
SDE to Store gralher intormacren. The actual program tree 
is created and maintained uSing two general record types 
iinked together in a tree structure through use of pointers. 
The actual tree nodes are "programnode" records having name, 
syntax, parent, and childiist fied. The children of a 
programnode are accessed through the childlist field, which 
points to a linked list of the second record type, the 
"“childnode." Each childnode 7) (im sceurcur points to a single 
child (of type “"programnode"). 

Each programnode's syntax field contains a pointer toa 
definition in the grammar linked list. Note that even if 
the node were produced from an alternation, the syntax field 
would reference the specific definition from the list of 


alternatives. The parent field 1s a pointer to the node's 
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pInememiinetie tree), which 15S also of type “progrannode." 
The childlist field, as previously mentioned, is a pointer 
fomene first link of the childlist. 


The childnode is a record with three fields: Each, 
eld, and@next. The "next" field provides the linked list 
structure. The "patn" field is a string type and records 


the name of a path to a child of the parent programnode, as 
dictated by the synthesis part of the parent's syntax refer- 
ence. The child field is a pointer to that child, as 
described in the above paragraph. The children of a 
programnode are thus grouped through a linked list of child- 
nodes. The particular child in question is accessed by 
traversing the list until the appropriate path is found. 

Figure 4.4 represents a Minigol program tree segaent for 
an integer declaration. The figure includes three prcgran 
nodes, two of which are children of the third. The two 
children are accessed through two childnodes. (Note in the 
figure that the "t" childnode accesses an "int" program node 
rather than a "type" node, as expected in the grammar rule 
for declarations. This is because "type" is an example of 
an identity rule as described in Section 3.D.) 

Chapter 3 mentioned the use of "CN" and "CP" to keep 
one's place within a program tree. Note that CN is a 
pointer to a program node, while CP is a pointer to a child- 
node. In Figure 4.4, for example, the "integer™ portion of 
the declaration would be the current position if the CN 
referenced the "deci" node and the CP referenced the "t" 
childncde. 

It remains to be discussed how the above data types are 
used to implement the "speciai" grammar features described 
ioe section 3.C: the sequence conventions (Kleene "x", 
Kleene "4+", ACs) abd the optional node ("7"). 
Seguences are implemented by creating SDE-defined "seguence 


nodes." These are programnodes named "seq" having exactly 
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Figure 4.4 Minigol Tree Segment: Integer Declaration 
two sons, the first of which is the node type being 


sequenced, and the second of which is either a "nil" node, 
which ends the seguence, or another "seg" node. A seguence 
is thus igplemented as a chain of "seg" nodes to each of 
which is attached one item in the sequence. 

The "nil" node mentioned above 1S a programnode named 
"nil" haviny nil references in [oth its syntax and cnildlist 
fields. It is used both to end a sequence and to waive an 
optional node. For example, Suppose an "if" produceran 
contains an optional "else" clause. If the user elects to 
use this clause, then construction continues as if the 
clause were mandatory. If the user chooses to omit this 
clause, however, the childnode representing this path 1s not 


destroyed; rather, a "nil" node is created and referenced by 
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the chiidnode. The parent programnode thus retains paths to 
all of its children as dictated by its synthesis reference, 
even if some of those children are not used. Note that the 
absence of a "nil" node at the end of a childnode is inrtier- 
preted as an incomplete tree segment by the SDE. 

Figure 4.5 demonstrates the use of both "seg" and "nil" 
nodes. Note that the syntax references of the "seg" nodes 
refer to the rule that generated the sequence, i.e., the 
rule that created the parent of the entire sequence. Note 
also that the first path name in the "seg" node's childlist 
is inherited from this same rule. The second childnode'ts 
Mame, however, is always "next." 

Finally, user selections from a grammar set are stored 
in a unigue way. Sets, as stated above, are useful ina 
grammar to note the characters that can correctly compose 
such program entities as variable names, numeric constants, 
and subroutine names. Because the SDE checks only for 
syntactic accuracy, any character seguence is valid if 
syntactically correct -- that is, variables need not have 
been declared, scoping rules do not apply, and type clashes 
are irrelevant. Ln Short, the only important information 
contained in a hame is the name itself. It wouid prove 


wasteful of storage to allocate a separate node for each 


letter in each variatkle name. The SDE therefore takes a 
different approach by creating a "str" node. Unlike "seq" 
fee N21" nodes, a "str" node is not given the nane "str," 


but rather assumes the name of the particular set in gues- 
vO « This node has a syntax value of nil and at least one 
childnode. The first ten characters in the variable name 
are stored in the "path" field of this childnode, and addi- 
tional childnodes are created and linked together as needed 
to house the full name. Note that no programnodes are 
constructed at the end of any of the childnodes. The value 


of the childnodes is in the path names themselves. 
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De. SOME IAPLEMNENTATICN PRIMITIVES 


In this section are presented 
the SDE to 
Their understanding both facilitates 


dures used by peforn 
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"Orimacive! 





Sequence of Declarations 
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some functions and proce- 


operations. 


explanation of higher- 


level operations and provides insight into the organization 
of the SDE as a program. 

“Findrule" is a function that, given a name, checks the 
Mesto, Glammar CuLes to see if it includes a rule by the 
given name. If it does, a pcinter to the entire rule is 
returned; otherwise a nil value is returned. "Findset" is a 
Similar function that tries to match the argument with an 
entry in the list of grammar sets. The purpose of these two 
functions is to retrieve the rule or set entry in a grammar 
when only a name is known. The use of the nil value also 
serves as a Boolean indication that the name is not in the 
specified list. 

"Lookup" is a function whose arguments are a pointer to 
a program node and the name of cne of the node's childpaths. 
The function uses the node's syntax reference to access the 
definition that generated the ncde, then matches the name of 
the childpath with anentry in the nonterminal dicticnary. 
The function returns a pointer to this dictionary entry. 
"Lookdn," on the other hand, is called with a pointer toa 
programnode and a dictionary entry as arguments. The func- 
tion traverses the node's childlist to find a match between 
a childpath name and the name in the dictionary entry, and 
returns a pointer to the childnode thus found. These two 
functions are used in a variety of ways throughout the SDE, 
both individually and together. For example, when moving to 
a "right" brother, “lookup” returns the dictionary entry of 
the present CP; the entry's "next" field is accessed to 
obtain the right brother's path name; and “lookdn" is used 
to move the CP to the appropriate childnode. 

The last set of primitives to be discussed are those 
that create nodes in the program tree. "Makenode" isa 
function that returns a pointer to a new programnode (which 
is created as a "side effect" by the function). To make the 


node, the function is provided a pointer to a definition in 
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the grammar and the value of CN. It creates a new node with 
a name as determined by the “tag” of the dictionaeaa 
synthesis part, a syntax pointer to the definition itseli, a 
parent pointer to the CN, and alist of childnodes asiamage 
cated by the attribute-value pairs in the definition's 
Synthesiem oar. Any terminal sons, as indicated in the 
synthesis part, are also created at this time. 

"Makesegq" is a function that creates a "seq" node with 
exactly two chilidnodes. Since the syntax and pathname 
values are determined by the parent of the sequence and not 
by the expected child, no definition pointer is reguired as 
input; instead, "makeseq" is called by providing the CN and 
CP values. The CN provides the "seg" node's parent refer- 
ence, aS in "“makenode." If the function is creating the 
first "seg" node in a seguence, the CP provides the pathname 
which will be copied in the "seg" node's first childnode; 
otherwise, this information is copied from the parent 
("Seq") node's first childnode, and the CP is ignored. The 
result of this process is that the syntax reference and 
pathname provided by the original parent are passed down 
through the sequence as new "seg" nodes are created. 
Identifying what type of sequence one finds himself inis 
therefore a simple operation. 


"“Makenil" is a function that returns a pointer to a new 


node named "nil." The new node is also given nil values in 
its syntax and childiist fields. The purpose of this Ziuges 
tion is to create anode that "ties off" sequences or 
optionals that are not acted upon. Its Single argument 1s 


the CN value. 

"Makestr™ is a function that creates a programnode to be 
linked to a string of characters as specified by one of the 
sets in the grammar. The function is invoked by providing a 
pointer to an attribute-value pair (representing a child of 


the "tag" field node) in a rule's synthesis part; it returns 
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a pointer to a newly created node. The new node is given 
the same name as the "value" fortion of the input pair -- 
note that it wili always be the name of a set in the 
grammar. The new node is given a nil syntax reference, like 
the node created in "makenil," Eut is also given one child- 
node, the path name of which is initialized to an "open" 
symbol foliowed by blanks. (Later, as characters are added 
to the string, they will be placed in successive locations 
in the pathname or this childnode, andthe "open" symbol 


will be pushed toward the end of the word. As the last 


character in the word is occupied, a new cChildnode wiil be 
attached to the previcus one, and the string will continue 
in the patnname of this childnode. When the user ends the 


string with the "endstr" command, the "open" symbol will be 
changed to a "closed" symbol, indicating the completion of 
moe string.) Note again that unlike "seg" and "nil" nodes, 
whose name fields contain "seg" and "nil" respectively, a 
“str™ node is given the name ci the set whose members it 
contains in its childnodes. Also, since the user-provided 
string is stored in the childnodes, the names of these nodes 


acre not dictated by the grammar. 


Ee DETERMINATION AND DISPLAY OF LEGAL CHOICES 


Since the SDE is menu-driven, a mechanism is required to 
determine what selections should be made available to the 
user. If this information is retained by the program rather 
than Simply displayed and forgotten, it can also be used to 
insure the user's input command is iegal before it is 
processed, thereby catching errors early and eliminating the 
need for error detection and recovery in the command 
processing portion of the progran. For these reasons, the 
set of legal user commands 1s assembled into a linked lisi, 


displayed in the menu, and retained for comparison against 
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USC ea Ute This process is repeated for each individual 
command. 

The iist of legal commands is dependent on one's ioca- 
tion within the program tree. Note, however, that the 
commands to end the session and change the current depth, 
display toggle, or move toggle settings are always offered. 
The routine for determining additional commands is as 


follows: 


1) If the CN has a parent value (which will always be true 


unless the CN is the root node), add "parent" to the list; 


2) If the CP haS a program node at its end, add "erase" 


and “grab" to the ccnmand list erdperem. 


a) If the programnode at the end of the CP has a syntax 
field pointing toa valid nonterminal dictionary entry 
(and is therefore not a "str" or "nil" node), add 


“chiid" and “change focus" to thes acr, 


dp) If instead the CP points toa "“str" node (detected 
when "findset" returns a nen-nil value for the node's 
Mame), see if the string is closed; if it isn't, add 
"any x" (where "x" 1s the set name) and "endstr™ to the 
Last.: 


3) If, on the other hand, the CP has no node at its end, 
add "put" to the list and follow the iterative routine in 
Figure 4.6. This routine searches the sequence of prcduc- 
tions that will be applied at this point in the program 
tree until either an alternation or a set (which is really 
an abbreviation for an alternation) is found. (Note from 
Section 3.D that grammar design restrictions insure that 
one of these two conditions will always be met.) When the 
alternation or set is found, add the appropriate choices 


to the command list; 
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{declarations (see Appendix A for types): 
Meealrst; {nt—-dictionary areas) 
g: grammar; (for rules 


done: boolean; 
{Note x has already been initialized, using 
for the CP path} 


word:= x*.val; {EXPECTED NODE NAME} 
done:= false; 
io nde UlouGxe. Val) > {RETURNS RULE PIR OR 
while (g.<>_nil) and not done do 
if g*.isalternation then done:= true 
else begin 
Ms Gee deri NLdict ; 


TO 
else {RULE MUST BE AN IDENTITY 
SO FIND SUNGLE NONTERM} 
word: = g*.defn*.anal*®.info; 
R SET TO BE CHECKED, SO UPDATE G VALUE 


g:= findrule (word) ; 
end; {LOOP BACK TO WHILE} 


set to use. } 


ee A NGS i SE ee sen ee es n e 


word: alfarec; (with "wrd" and "len" fields) 


lookup," to reference the nt dictionary entry 


if x <> nil then word:= x*.val Aes S ome 
(oc DeGr ENTRY} 
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{Now, either g points to an alternation or is 
nil. If nil, then word holds the name of the 
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Figure 4.6 Routine to Find a Set or Alternation 


4) Tne final commands to be added depend on whether or not 


the CN accesses a "seg" node. If it does not: 


Moe r che GlCtiohnary Entry has non-nil "next" or "prev" 


mreids, daa "right™ or "left" to the command 


b) If the affixes on the dictionary entry 
Optional child ("2") or a Kleene "*", and if 
child presently at the end of the CP, add 
the command list (to end an empty seguence 


Optlon) ; 


5) If, on the other hand, the CN does access a 


G3 


ees es 


indicate an 
there 1s no 
Nendstr" to 


Or Waive an 


"seg" node: 


a) If tne CP points to the first Son of the "Seq" )itegs 
(i1.e€., to one of the items in the seguence), then add 


"right" (which means the next item in the seguence, to 


the user) and "insert before" to the list. Also add 
“rest seg" 1f the second scn of the "seg" node exists 
and is not "nil." Further, if tae CN‘sS parent is adem 
"seq" node, add ent. (The user should be able to 


travel left and fright to items in the sequence without 
knowing how the seguence is implemented. A "seg" node 
above the CN means the CP does not reference the first 
item in the seguence, so there are items to its left on 


the display); 


b) If the CP points to the CN's second son, add "eum 
to the 11st. If the CP also AaS nO program Node aimee 


end, add "endstr." 


Display of the legal choices on the menu is accomplished 
by accessing each node in the list of choices and displaying 
the contents of the node. Control characters are unprimage 
able on a screen, so they are displayed as the two-character 
sequence of "at and the printable key to be typed on the 
keyboard. If the display routine detects a set in the list 


of commands, "any" and the name of the set are displayed. 


Fe. PROCESSING THE USER*S COMMANDS 


Every command entered by the user is checked against the 
current list of legal commands. A command is determined to 
be legal if it either matches one of the commands in the 
command list or iS a member cf the set included in the 
command list (as discussed in Step 3 of the algorithn 
presented in the previous section). If the command is 
legal, it is fowarded to the command processing routines of 


the. SDa. If it is an illegal command, the rest of the 


70 


command string is igncred, the screen is refreshed, anda 


message informs the user that he entered an illegal 
selection. 

As mentioned in Section 3.D, user commands may be 
categorized as either "standard" or “special." Each copmana 


in the command string is first examined by the routine that 
implements standard commands. If this routine does not act 
on the command, it is instead frocessed by the routine that 
handles special commands. The "stdcmd" routine is therefore 
implemented as a Boclean function that returns whether or 
not it acted on the command; any action taken is performed 
as a "side effect." "Checkspeccads" is a procedure invoked 
only when it is already known that the command is nota 
standard command. 

Implementation of the standard commands "right," "left," 
"parent," and "child™” were discussed in Section 3.D. That 
discussion, however, does not apply when the CN is inside a 
Sequence. In such a Situation, "parent" moves the CN and cP 
to the beginning of the sequence so that the CN references 
the parent of the entire seguence and the CP references the 
path that leads to the sequence. The entire sequence is 
therefore nighlighted during sukseguent unparsing. Right" 
and "left" are implemented so that the CN, CP pair nove 
“down and right" or "up and left" in the seguence structure 
to reference the next lower or next higher item in the 
sequence. During unfarsing, the right or left item on the 
disnvlay is highlighted. The "child" implementation is unaf- 
fected by the presence of a Sequence. 

The "rest seq" command is implemented by simply shifting 
the CP to the second son of the CN. Since this command is 
legal only when the CN is a "seg" node, the CP references 
its first son, and the second son exists and is not "nil," 
moving the CP to the CN's second son positions it over 
another "seq" node. The CP thus accesses a "seg" node whose 


descendants include the remaining items in the sequence. 


eat 


"Erase" is another command implemented in two ways 
depending on whether in a sequence. Normally, the CP's 
Cchildpointer is simply set to nil, thereby losing the 
program segment previcusly referenced. (Note that the SDE 
performs no "garbage collection.") When the CP references an 
litem ina sequence, however, the higher and lower portions 
of the sequence structure are "tied together" so that only 
the single link is erased. 

"Change focus," "change depth," and "grab" are easily 
implemented. For "change focus," a variable named "focus" 
is set to reference the CP's programnode child, and the CN 
and CP are changed as if the "chiid" command were entered. 
Since the "focus" value determines the location in the tree 
from which unparsing begins, only this subtree is displayed. 
For "change depth," an integer is retrieved from the command 
line (or else the user is prompted for one), and this value 
is assigned to the "depth" variable. For "grab," ‘aig 
digit integer is again retrieved from the command line or 
from the user, and this value is used as an array index to 
Store 4a yvointer to the GP"s child 

"Put," the complement of "grab," is a more complicated 
operation. Pac oie the segment to be implanted must be 
compatible with the expected node at the end of the CP. 
Also, the nodes in the accessed segment must be copied and 
their parent references changed so that the instalied 
segment is completely independent of its model elsewhere in 
the tree. Compatibility is checked by comparing the root of 
the segment to be implanted against all the nodes expected 
at that point in the tree based on the grammar rules. set 
compatibility can be checked because the "str" node retains 
the hame of the set whose members Le contalias 
Compatibility of sequences is checked by comparing the node 
type of the items in the sequence. Duplication of the 


grabbed program segment is done through a "copy" routine, 


ez 


which dynamically creates new nodes and gives them all the 
attributes of the nodes in the grabbed segnent except those 
that form the tree structure (such as parent, hext, and so 
Oi) < The structure fields are given new values that refer 
to (and result in) the new structure. Note that because no 
garbage collection is performed when segments are erased, 
such segments still reside in main memory and are therefore 
available to be copied (if, of course, they were "grabbed" 
prior to being erased). 

The "insert before" conmand is implemented in an opfo- 
Site manner from the "erase" for sequences. Whereas "erase" 
removes a link from a chained seyuence structure, "insert" 
installs a link into the sequence. Parent and child refer- 
ences of surrounding nodes are adjusted to accomodate the 
new link. 

The implementation of "endstring" depends on whether a 
user-defined string is to be clesed or a sequence or option 
is to be ended. A string is closed by changing the "open" 
symbol at the end of the "str" node's string to a "closed" 
symbol. A sequence or option is closed by putting a "nil" 
node at the end of the CP. 

Finally, the toggle commands ("dsply" and "move") are 
implemented by negating the previous values of Boolean vari- 
ables. Using the "display toggle" also adjusts a variabie 
containing the number of lines cn the screen. When the menu 
display is disabled, the unparser plans the dispiay to 
utilize the larger screen size. 

The above paragraphs descrite only the standard commands 
input by the user. The special commands are implemented in 
an iterative procedure called "checkspeccmds." Note that 
whereas the routine to determine legal commands (presented 
in Section 4.E) stepped through grammar rules to find aa 
alternation or set, "Checkspeccmds" creates the nodes 


dictated by the rules along the way to act on the 
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alternation or Set. The coding in this routine followseeee 
algorithm for node creation as ;rresented in Section 3.D, and 
relies on the restrictions of input grammars as discussed in 
that section and in Appendix C. Figure 4.7 contains the 
kernel of the procedure. As mentioned above, "check- 
speccmds" continues to add nodes until either an alternation 
or set selection is found. The user's command is then acted 


upon, and the procedure terminates. 


Ge. UNPARSING: DISPLAY OF THE PROGRAM TREE 


Section 3.D presented a brief summary of the algoritha 
used by the SDE to convert the program tree to textual forn. 
This conversion is jperformed by the recursive procedure 
“unparse," which includes mechanisms to handle Sequences and 
Sets. When unparsing for screen display, "unparse" also 
activates the screen's inverse video (as directed in the 
"TERM" file) while the CP's descendant programnodes, which 
reflect the user's "current position," are being unparsed. 

As cone of its arguments, Nunparse" is passed a pointer 
to a node in the pregram tree. While this parameter is 
sufficient to effect conversion of the tree to textual forn, 
two additional parameters are required to insure proper 
display of the text on the screen. One is an integer repre- 
senting the current "depth limit," or the number of genera- 
tions of descendants the user wishes to be displayed. The 
other parameter is a Boolean value reflecting whether this 
invocation of the prcecedure is being made in the first or 
second "pass" of the unparsing effort, as discussed below. 

Unlike a text file, the terminal screen can only display 
a fixed number of lines at a time. Suppose a terminal 
displays 24 lines. If the SDE displays more than Gee 
number of lines, the first lines will be scrolled up and off 


the screen. Further, if the menu iS to be displayed, the 
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Figure 4.7 Portion of ‘'Checkspeccnds' 
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bottom few lines of the display are unavailable for program 
text. Finally, whatever 1s displayed on the screen must 
include the CP's descendants fcr the menu to be meaningful 
and for the user to view the program portion he iS manipu- 
ia taeirg.. The SDE solves these [roblems by making two passes 
through the unparser. In the first pass, the unparser 
converts the program tree to text but does not send it to 
the screen. Rather, it records the number of lines between 
the first line to be displayed and the beginning of the 
display of the CP's descendants, then calculates how many 
lines must be unparsed but net displayed so that, when 
display does commence, the CP's descendants will appear on 
the center line of the usable screen area. During Vaare 
second pass of the unparser, text is sent to the screen only 
after the calculated number of lines is unparsed. Likewise, 
unparsing terminates after a prescribed number of lines is 
displayed (to avoid scrolling). 

The SDE allows the user to select the depth of his 
display. As stated earlier, he can use this feature to view 
the entire program from a broad perspective or inspect a 


small portion in its greatest detail. The feature is iaple- 


mented by passing adepth parameter to "unparse," whicna 
decrements the value upon receipt. When "unparse" calls 
itself recursively, it passes the decremented value. A 


non-positive value causes the procedure to display only 
"..." and return; neo further unparsing is performed On vem 
tree branch. The effect 1s that unparsing proceeds to the 
desired depth and no further. Note that sequences are all 
considered to be of the same depth; thus a degradation of 
each succeeding item in the sequence is avoided. 

Seguences occur ina tree when a grammar rule specifies 
a nonterminal with an affix of We ts Oe ee Unparsing 
the first two cases 1S performed Simply by ignoring the 


"seg" nodes and recursively unfarsing their two. sons. Ina 


76 


GicmedscmOr est neweiiiersiS@wilth atfix of “."), the first son 
of the "seg" node is unparsed, the character used to sepa- 
rate the sequence items iS printed (if there are more items 
in the sequence), and the second son (another "seg" node) is 
unparsed. 

Note that the entire tree need not be unparsed every 
time the screen is refreshed. Rather, only the visible 
portion of the tree has to be unparsed. Unparsing therefore 
begins at the focus node, not necessarily the root node, and 
terminates when depth limits have been exceeded or the 
screen is already full. 

"Onparse" uses three formatting instructions from the 
input grammar. The grammar designer may specify formatting 
to effect "prettyprinting" by placing these single-character 
instructions in the terminal seguences of the analysis part 
of rules. The symbol "%" denotes carriage return, while "\" 
and "!" denote two-space indent and outdent, respectively. 
Note that "\" has two effects: it immediateiy causes the 
printing of two blanks, and also moves the unparser's record 
of left-most column justification two spaces to the right. 
The "!" symbol, on the other hand, only has tke effect of 
moving the left justification krack two spaces to the left. 
An example of the use of these symbols may be found in 
Appendix E. 

in addition to the formatting as specified by the 
grammar, the unparser also forces a carriage return and 
two-space indent (not affecting the left justification) 
whenever a line is too long for the display (as recorded in 
the "TERM" file). in this way the unparser avoids splitting 
of tokens onto two lines and keeps track of the actuai 


number of lines involved in the display. 
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He. STORAGE AND RETRIEVAL OF THE PROGRAM TREE 


The final implementation detail to be discussed is the 
storage of the program tree at the end of a session, and the 
reconstruction of tne tree from the stored form during 
subseguent sessions. 

Storing the program tree is done simply by writing the 
Mame of each node encountered during a preorder traversal of 
the tree. Terminals of the translation (as designated ina 
rule's synthesis part), seguence nodes, and "nil" nodes are 
preceeded by a guote mark. When childnodes with no nodes at 
their ends (signifying an incomplete program tree) are 
encountered, the expected node name is printed, Surrounded 
by the symbois "<" andwa Strings stored in sets are 
preceeded by an apostrophe and printed aS continuous 
strings, even if they extend through several path names. 
The "open" and "closed" symbols are also. printed at the end 
of strings. However, if the string was not closed, the name 
of the set type is aiso printed, surrounded by the "<" and 
">" symbols. Figure 4.8 shows the stored form of the 
"Minigol" program presented in Figure 2.2. Note that the 
letter "K" is used to represent the "closed" symbol; the SDE 
actually uses an unprintable control character for this 


purpose. 


"seq asn id '1K num 'OK "seg while reln lt | Sane 
num *10K block "nil "Seq asn aid "JA mut ia 
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a al al 
d 


ee | 





L. 


Figure 4.8 Stored Form of Sample Minigol Progran 
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Reconstruction of the tree from its stored form is some- 
what more difficult than storing the tree. The reading of 
the first character in each name reveals whether it is a 
string (the character being "'"), an incomplete tree segnent 
("<"), a terminal, "seq" or "nil" node (the quote), or the 
feag’  tleld cf a grammar rule's synthesis part. He sk jee Tamers 
tag field, function "findsyn" searches the grammar to find 
the responsible synthesis part. Using this synthesis fart, 
the SDE constructs a new program segment just as it would 
during an editing session. It also uses the synthesis part 
to learn how many sons the tag node has, so "readprogran" 
can call itself recursively the appropriate number of times. 

Encountering a "seg" node in the stored form causes the 
creation of a "seg" node in the program tree. Because a 
"seq" node always has two sons, "“readprogranm" calls itself 
twice recursively. All other names preceeded by a quote in 
the stored form are given nil syntax and chiid references 
and cause return from the recursion. 

Strings are also reconstructed based on the _ stored 
information. The name of the "str" node to be created is 
determined by using “lookup" tc determine the name of the 
expected node at this point in the tree. (The expected node 
will always either be a set crlead to a set through a 
series of identity rules.) The pathnames of the "str" 
node's childnodes are input frem the string in the stored 
mie. if the string is open, the subseguent nonterminal 
enclosed by the "<" and ">" symbols is read and ignored; its 
only value is to provide documentation to the user, should 
he wish to inspect the tree's storage. 

The only restriction this rcutine places on input gran- 
mars 1s that the tag field of every synthesis part must be 
unigue in order to guarantee that "readsyn" will find the 
correct synthesis part. Consider the grammar in Appendix D, 


for example. There are two productions from the nonterminal 


no 


Mat, so tnere are two synthesis parts whose tag fields cause 


the creation of a node representing "A" in the tree. Tnaese 
nodes must be unigue, however, so "ail" and "a2" are used. 
in tnis “particulem case, finding the incorrect synthesis 


part would nave destructive effects since the number of sons 
differs in the two rules. The program would thus either 
encounter too few or too many Nhames in the stored file. 
However, consider what would happen if two synthesis farts 
had identical tag fields and Similar children in order and 
type. "Findsyn™ would find the first occurrence of fnew 
and because the numbers and orders of children agree, the 
tree would be constructed without abnormal termination. The 
Syntax reference of the node created would reference the 
rule found by "findsyn," with the result that the text 
unparsed from this portion of the tree would reflect the 
analysis portion of the first rule, not the one intended. 
The text form of the program thus would be incorrect and 
misleading. 

On the other hand, the SDE's manner of tree storage and 
reconstruction gives the editor the potential for transla- 
tion between programming languages. This potential will be 


further discussed in Chapter 5. 
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A. MEETING SDE DESIGN REQUIREMENTS: RHADE-O FES AND 
SHORTCOMINGS 


The SDE was designed to meet several objectives. It was 
to be highly interactive to facilitate ease of use, yet be 
table-driven both to support a number of programming 
languages and to enable use on any terminal. AS a progran, 
it was to be portable from one operating environment to 
another and facilitate certain trodifications a user may wish 
to make to the SDE. Finally, a general objective of the SDE 
was to be as easy to use asa text editor: the user should 
hot have to "pay" for the advantages of syntax-directed 
editing by enduring the clumsiness of the tool. 

The SDE has many features that achieve interactivity, 
perhaps the most obvious of these being itS Menu. It was 
felt that users taking advantage of the SDE's multi-language 
Capability should not be required to know the details ci any 
input yrlammar; menu display was therefore appropriate. 
However, menus often become burdensome in interactive 
programs, especially as the user gains familiarity with the 
program and no longer needs to be reminded of his options. 
For this reason the SDE permits suppression of the menu, an 
option which has the added advantage of presenting a larger 
part of the edited program on the display. 

The use of two passes through the unparser involved a 
design trade-off to enhance interactivity. Only one pass is 
required to convert the program tree to textual form. 
However, experience with this approach revealed that the 
unparser often scroiled needed information off the screen, 


hiding the program portion being edited and making the menu 
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obsolete. To combat the situation, tke user was forced to 
repeatedly alter the focus and depth settings co posSieaae 
the display as desired. While one solution to this probien 
would have been to display only a constant number of 
programnodes on the screen, there iS no correlation between 
the rule referenced by a programnode and the amount of space 
taken by that rule in the display. In the "#inigoues 
grammar, for example, the "type" rule only reguires the word 


"reai" or "integer" on a single line and lets other rules 


continue on the same line, while the "block" rule reguires 
at least two lines all to itself. (This example demon- 
strates a Common, known grammar. One can imagine that a 


user-designed grammar could contain even greater deviation 
in space requirements.) A two-pass approach was therefore 
taken as described in Chapter 4. The trade-off involvedwame 
that the two-pass uaparse iS obviously less efficient than a 
one-pass effort. Indeed, early work showed that interac- 
tivity was noticeably affected by the delay between user 
input and SDE response, mostly because the SDE kept the 
focus node (from which unparsing begins) at the root of the 
tree until the user changed it. This meant that a louwom 
unparsing was taking place to display only 18 lines of text. 
Efficiency was improved by modifying the SDE to automati- 
cally adjust the focus node closer to the CP (by a heuristic 
number of nodes) when appropriate. This made the two-pass 
approach acceptably yuick, and actually reduced proyramaing 
effort in other portions of the SDE by keeping the focus 
node above the CP in the _ tree. Note that were this 
violated, as could occur when moving the CP to the "parent" 
of a seguence without also moving the focus, the display 
would not contain the CP at all, and the menu and display 
would be meaningiess to the uSer. 

As mentioned in Chapters 3 and 4, the use of the 


"command string" is another trade-off forced py the use of 


ay 


Pascal as a programming language. ideally, =the SDr& should 
act on the program tree and refresh the screen (including 
menu) with every keystroke the user maxes; Pascal, however, 
does not permit such an anput method. Bveme Li at did, 
however, the refresh rate of mcst terminals would make this 
approach unacceptably slow. Entry of a string of commands, 
followed by a single carriage return, offers faster editing 
at the expense of outdated menus and text displays. It is 
offered as a temporary solution until either terminal tech- 
nology improves or a significantly different display algo- 
rithm is developed for the SDE. 

Inherent in the requirement that the SDE be interactive 
is that it allow storage of one's work for subseguent 
editing. Unless useful to an interpreter or compiler (as 
mentioned in Chapter 1), the arsed program file is useful 
only to the SDE itself; the user seeks the text form. The 
tree need never be stored, therefore, if the user always 
writes complete programs (which will never be modified) in 
Single editing sessions. Such a restriction on progran 
writing is, of course, unthinkable. The tree is therefore 
stored to enable progressive development of programs by the 
user. AS mentioned in Chapter 4, however, the SDE's routine 
to recreate a tree from the stored form requires that the 
input grammar have unigue tag fields in its rules. While 
this is an acceptable restriction on grammars used only by 
the SDE, it may have consequences on interpreters or compi- 
lers seeking to act on the same stored forn. 

Pascal does not permit storing of pointer values, so the 
tree could not be stored in any direct manner. Text storage 
was Chosen because it facilitated user inspection as well as 
tree recreation. Inspection of the stored form is hindered, 
however, by the choice of "open" and "closed" symbols at the 
emanofrf strings. These symbols must be unprintable control 


Characters to prevent a user's input characters from being 
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misinterpreted. However, as ccntrol Characters they maya 
displayed in unpredictable, terminal-dependent ways -- in 
fact, they may be interpreted as commands to the terminal, 
altering the display itself. This Situation may be easily 
remedied for a particular terminal by selecting different 
"open" and "closed" symbols (as explained below), but a 
universal solution is lacking. 

A second requirement of the SDE was that it be table- 
driven. It has been shown that the SDE iS Capableu=oe 
supporting virtually any context-free grammar. The restric- 
tions placed on grammar design in Appendix C limit only the 
representation of such grammars but do not restrict the 
class of grammars representable in this forn. Appendix B 
lists the format reguired by the SDE to input such grammars. 
Note that this format is itself a context-free "grammar" for 
the "language" of SDE input grammars. The SDE can therenteme 
ke used to write grammars for itself. Appendix B includes 
the syntax of input grammars in a format Suitable for input 
to the SDE. It was used, in fact, to generate the "Minigol" 
grammar in Appendix E. As with any "program" written using 
the SDE, the grammars in Appendices Band Eare the text 
(unparsed) forms created in editing sessions. 

The anput grammar format is admittedly awkward to the 
human reader. Certainly better grammar representations 
exist, such as Backus-Naur form and Argot [Ref. 20]. it is 
possible to use the SDE to create grammars using these nore 


desirable formats as follows: 


1) Write (by hand or using the grammar in Appendix B)ae 
input grammar to the SDE that describes the desired 
oO biplane. The synthesis parts of the rules in this grammar 
must exactly match the synthesis parts in the grammar at 


Appendix B (except as discussed in a later section). 


2) For every programming grammar to be designed: 
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a) Initialize the SDE using the product of (1) above as 


nee plc aliaL, 


bD) Write the programming grammar during the editing 


session; 


c) Save the PARSED FORM of this programming grammar and 


terminate the session; 


d) Re-~initialize the SDE using the Appendix B grammar as 
the input grammar andthe parsed form saved in Step 2c 


above as the input progran; 


ele Save thewsfEXT FORM of the tree just created and 


terminate the session. 


The text grammar created in Step 2e above may subsequently 
be used as an input grammar to the SDE to create programs in 
this new grammar. 

. The key to the above technique is that the stored form 
created in Step 2c is suitable as an input program when the 
SDE is uSing either the grammar written in Step 1 or the 
grammar in Appendix BB. Thus the SDE is used to transiate 
programs written in one grammar into acceptable programs 
under another grammar. Memo e Ss) translation Capability is 
explained further in a subsequent section. 

Terminal independence has been achieved for the most 
Part . Use of the "TERM" Eile frovides a degree of informa- 
tion hiding in that the SDE achieves desired display effects 
without any knowledge of how they are implemented. 
Maintaining a set of files in the "TERM" format also enables 
easy transition from one terminal type to another. The 
Simpiicity of the format also facilitates user-adaptation of 
hew terminals. Appendix F lists the procedure to create a 
mee ah" rile. 
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Terminal independence was not fully realized in a 
different sense, however. The three symbols "\", 92" Sama 
mrw were selected as sSpeciai formatting commands because 
they are not likely to be useful symbols to a grammar. 
However, making them special ccmmands not only renders then 
unavailable for any other purpose, but also means that any 
terminal lacking these keys can not be used to specify 
output formats for new grammars. Like "open" and "closegme 
however, these Ssymbcls may be modified py the user to adapt 
to his particular environment as explained below. Mote that 
the format commands should be frintable characters (unlike 
“open “and "elosed™) so that the user can inspect his 
grammar and effect modificaticns. A way to avoid the 
unavailability of the selected keys wouid have been to use 
two-stroke commands to direct formatting. The sequences 
could be common keys without making these keys unavailable 
to the grammar designer for other purposes. "SR" Coulee 
represent "carriage return," for example, and both keys 
could be used in other terminal sequences (excert aS a 
pair). 

The SDE was developed in a Unix environment under 
Berkeley Pascal. While the SDE is not claimed to be written 
in purely "Standard" Pascal (if indeed such a tJlanguage 
exists), the program is designed to minimize taking advan- 
tage of its environment's uniygue facilities. It makes no 
direct calis to the operating systen, althougo it does 
invoke the "argc" and "argv" routines to obtain parameters 
from the Unix command line (See Section 2.A). These calls, 
however, are purely for the user's convenience and can be 
removed from the SDE without affecting the rest of the 


program. In fact, the Unix environment actualiy hinders the 


SDE"S intebaetiyrrys The reader has no doubt been confused 
over the selection of Such meaningless keystrokes as 
control-f for the “chald™ | commamnae Such keystrokes were 
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selected only because more meaningful commands (such as 
cone lr Ol=C) are intercepted by the . Unix operating systen 
before being passed to the SDE; under Unix, control-C causes 
HoOLtVON Of Program execution. Indeed, (capital) Ren Ow 
"rest of seguence"™ was selected because all the other 
control characters either were taken by the SDE or caused an 
undesirakle effect through Unix. The SDE copes with this 
problem and at the same time offers extreme modifiability to 
the user by declaring these command keystrokes as variables 
which are defined during program initialization. When the 
SDE is moved to another operating system, the installer can 
redefine all of the SDE's standard commands to whatever keys 
he wishes. He can also redefine the "open" and "closed" 
symbols placed at the end of strings as well as _ the 
carriage-return, indent and outdent formatting commands. It 
is reccmmended that all of these keys (except the formatting 
commands) be left as control characters, however, because 
designation of printable characters renders these characters 
unavailable for grammar or program design. Note also that 
Changing the "open," "closed" or formatting commands between 
sessions will render previous grammars or parsed forms 
unusable. 

Finally, the SDE was intended to be as useful as a text 
editor. Toward this objective, the SDE has had mixed 
success. There are many things a syntax-directed editor can 
do that a text editor can not. In fact, as mentioned below, 
a syntax-directed editor even offers certain text-editing 
facilities not found in a text editor. However, the present 
SDE also lacks certain features considered crucial to 
successful program development. First, it is slower than a 
text editor in that, in most cases, the time required to 
enter a program under the SDE will be greater than under a 
Becae editor. This is due primarily to the command line, 
which provides feedback only when the display is refreshed. 


Se, 


A much more desirable facility would be to exhibit the 
effect of each keystroke immediately on the display, prefer- 
ably without having to redraw the entire screen. Suc haa 
facility, however, would be extremely complicated, nay 
require extensive terminal-dependent features, and may even 
require Smarter terminals than fresently available. Another 
reason for the SDE‘'s slowness is the use of control 
sequences to effect commands; use of "arrow" or function 
keys would be better, but they are not offered on all termi- 
nals. A third reason for tae SDE's slowness is inherent in 
its one-way mapping from tree te text: one achieves movement 
about the display only indirectly by moving through the 
tree. These movements are often unintuitive and cause a 
jerky motion on the display. | 

Aside from speed, another advantage of text editors over 
the SDE is that the user may enter comments under a text 
editor; no mention of such a facility has been made in this 
paper. Comments were not included in the SDE's capabilities 
because it was felt there are no standard conventions for 
entering then. Some languages, such as LISP, FORTRAN and 
assembly language, use a delimiter such as "3" or WG 
indicate that the rest of the line is a comment. Others 
such as Pascal use beginning and end delimiters such as "{" 
and "}", and everything in tetween these delimiters 1s 
considered a comment. Some versions of PL/I require 
comments (enclosed by "/*" and "*/") to start and end on the 
same line. Additionally, of course, individual users will 
have their own preferences on how to display their comments 
within the confines of these conventions. All of this 
implies that comments may not be made a part of the SDE 
program but must instead be indicated in the particular 
grammar file being’ used. However, tne only ‘facility the 
input grammar format hasS at present is to include an 


optional "comment" node in every production in the grammar. 
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This wculd greatly increase the size or the program tree and 
also slow down the editing session (because the user would 
have to waive most of these optional nodes). One feasible 
solution would pe to include a "comments" command in the set 
of standard commands, but input the details of displaying 
such comments in the grammar file, possibly ina special, 
identifiable production with a different notation. This 
approach would probably require redesign of the programnode 
structure to reflect the presence or absence of a comment at 
this location in the tree. Sinply extending the childlist 
would not suffice because the SDE relies on the grammar- 
provided knowledge of how many children to expect for each 
node. Further, i1t is doubtful whether the comments should 
be kept as part of the tree at all, particularly if the tree 
is to be skared with other tools that ignore the comments 
anyway. Sandewall [Ref. 3] pecints out that comments nay 
create memcry management problems (such as fragmentation in 
virtual memory systems) if the commentsS comprise a signifi- 
cant percentage of the text. A more economical implementa- 
tion might be to maintain the comments in a separate file 
and include index references to this file in the progran 
tree. However, such a systegr might perforn slowly when 
required to display comments in the context of the progran, 
such aS in unparsing or prettyprinting. Further research on 
implementing comments was not ccntinued in the SDE develop- 
ment; at present the SDE has no comment capability. 

RiewovE aiso tacks a "search" facility. Whereas text 
editors can perform an exact pattern match between the 
user's search string and the text, there is no such capa- 
bility when searching a tree cf nodes -- the text exists 
only on the screen. Related facilities such as "global 
replace" are also lacking. There are several possible solu- 
tions to this shortcoming. Past; the SDE can actually 


unparse the entire tree into a file, then read the file and 
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perform pattern matching. However, such a strategy would 
only be able to acknowledge whether a pattern was present, 
then display this pattern and its surrounding strings on the 
screen; the SDE would be unable to relate this pattern toa 
particular node in the tree because of the one-way nature of 
Mapping from tree to text, although perhaps a complicated 
approach could effect this. The SDE would certainly be 
unable to effect a replacement of a random string with 
another because only user-defined hanes and grammar 
templates may be syntactically changed. A better strategy 
would be to provide a limited search capability restricted 
only to user-defined names, and perhaps to maintain a table 
of such names with pointers to all occurrences of them in 
tne tree. This would provide rapid response to the search 
guery and permit transactions (Such as repositioning the CP 
or changing the name) to be made on the tree. Searches for 
random strings such as "for 1 :=" could not be implemented 
this way, but searching for the variable "i" would eventu- 
ally find the desired locations. A third strategy would be 
to enable the SDE to perform "parsing" of the search string, 
then to perform pattern-matching between the tree segment 
thus created and the program tree. Note, however, that none 
of these solutions offers as great a facility as is pres- 
ently available in text editors. On the other hand, the 
second and third approaches offer a certain advantage in 
exchange for their inflexibility in that they will retrieve 
only those matching patterns that are also the proper struc- 
tures in the grammar. Whereas a text editor will retrieve 
every occurrence of "i" ina program (including keywords 
such as "is" and "in"), the above methods will retrieve only 


the variables named "i." 


310 


Be. IMPROVEMENTS AND EXTENSIONS TO THE SDE 


Like most useful programs, the SDE is not a_ static 
product. It has evolved since the beginning of this 
project, undergoing several significant improvements. One 
such improvement was the use of pathnames to house string 
aie Oormatlon. This modificaticn not only improved storage 
efficiency and eased grammar design, but it also was imple- 
mented within the data structures already established, 
fac lL Litating the modification as well as progran 
comprehension. 

The SDE is therefore a growing, changing product, and 
certainly it is not perfect in its present state. The 
shortcomings cited above must be overcome; there are addi- 
tional modifications that should be made to improve the 
efficiency with which the SDE presently functions; and 
further modification should be performed to give the SDE 
additional features and capabilities. 

One efficiency improvement that can be made is the SDE's 
implementation of sequences. Sequences are presently imple- 
Mented through chains of "seq" nodes each referencing 
exactly one item in the sequence. While there are certain 
advantages in knowing that each "seg" node has exactly two 
sons, it would be more efficient to link all items ina 
Seawence as sons of the same "seg" node. Moving to right or 
left brothers or toc the parent would be facilitated; 
however, other commands such as "rest seg" would have to 
implemented differently. Program tree restoration from the 
stored form would also have to tke altered, since the number 
of recursive calls to "readprogram" would not be known at 
the time the "seg" node was read. 

Another improvement that can be made inthe SDE is to 
reduce the storage overhead of grammar rules. FOr example, 


rules presently use Pascal strings to describe nonternminals 
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in their analysis, synthesis, and nonterminal dict iomam: 
parts. These nonterminals always reappear as names of other 
rules. A more efficient implementation would be to store 
pointers to these rules in the dictionary parts that refer- 
ence them rather than using the names themselves. The over- 
head of establishing pointers would be paid only once per 
editing session during initialization. While pointers would 
add indirection to operations such as displaying nonterninal 
names from an analysis part, it wouid eliminate the overhead 
of string-matching such as in "findrule." Another such 
improvement would be the use cf integers instead of ten- 
character strings to represent path names in the synthesis 
and nonterminal dictionary parts of the rules. in faete 
such pathnames could be generated sequentiaily by the SDE 
for each rule; the grammar designer would not have to plan 
for them or write them at all. Use o&f integers would not 
only be more efficient for the grammar designer and for 
Storage management, it would also improve the efficiency of 
looking down childlists to find a match between a given path 
name anda particular childnode. "Lookdn" could Siampiy 
traverse the childlist the apprcpriate number of childnodes, 
as determined by a path's integer "name." 

Another improvement to be made in the SDE is the storage 
of user-provided strings. While the present implementation 
represents a Savings over a character-by-character implenen- 
tation, it remains wasteful. By making the "childnode" 
structure a variant record, it could either reference a 
programnode or a new structure designed especially for 
string information. The "str" node and the establishment of 
entire childnode records could te avoided. Another improve- 
ment to string storage would be to include a "length" field 
in the record itself, which would eliminate the present 
delay in searching the entire stringy to find the "open" 


symbol when adding new characters, as well as provide the 
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unparser's formatting mechanism witn the length of the 
String more quackly. 

The need for alltag fields to be unigue can be elimi- 
nated through application of artificial intelligence tech- 
higques. When restoring a tree, the SDE could determine 
which synthesis part were referenced based on the nature of 
the attribute-value children of the tags in the rules. 

Another obvious modification to improve the SDE's effi- 
ciency is to eliminate the use of two record types in the 
program tree. The childnode cnly contains a pathname, a 
pointer to a programnode, and a link to another childnode. 
The programnode pointer field can be eliminated without loss 
of information by adding the other two fields to the 
programnode structure itself. This would save memory space 
and make the tree structure more understandable. Current 
SDE use of the CN andCP values, of course, would need 
revision. 

famably, the SDE"S lack of “garbage collection" should 
be corrected to improve efficiency of memory use. While not 
disposing of program segments that have been "erased" by the 
user avoids more complicated logic within the SDE- (for 
example, inthe "grab" function, which capitalizes on the 
present implementation), it can be extremely wasteful if a 
large amount of editing is being performed. Past experience 
with the SDE has been limited to operating on a VAX 11/780 
Minicomputer in the editing of small programs, and system 
performance has not sufferred under the present implementa- 
Lon . However, the SDE should be made more economical to 
insure acceptable performance both in editing larger 
programs and in operating on smaller computers. 

In addition to improvements which increase the effi- 
ciency of the SDE, there are also some extensions to the SDE 
that can be made to make it amore useful product in an 


interactive programming environment. For example, the SDE 
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can be made a more powerful tool by planning to handle 
certain semantic information as well as purely syntactic 
detail. For instance, most programming languages are 
categorized into only a few groups based on the type of 
scoping they utilize. SNOBOL, APL, and traditional Lise. 
dynamic scoping, while PL/I, FORTRAN and the ALGOL family of 
languages use static scoping [Ref. 17: p. 38]. If the SDE 
were so informed, it could perform type or scope checking 
during the editing session, and discover undeclared vari- 
ables for statically scoped languages. Another example of 
semantic help provided by a syntax-directed editor would be 
the checking of parameter lists. Most languages pass paran- 
eters either by name, reference, value, or some combination 
of these methods. Knowing, for example, that Pascal imple- 
mMents both "pass by value" and "pass by reference," a 
syntax-directed editor could detect the passing of a literal 
constant by reference (which 1s a security violation because 
it risks altering the value of the constant) and so inform 
the user {[ Ref. 2:2 op. 2212 It should be noted, however, 
that the more such ckéecking is "built in" to the editor, the 
less likely it is to retain its generality as a "universal" 
Seda Tors 

One class of languages for which the SDE could provide 
some very useful semantic information is the set of grammars 
it may generate. The input grammar in Appendix B guarantees 
the creation of syntactically correct grammars, but Gigs 
alone does not insure that the grammar produced is usable. 
It does not insure, for example, that the nonterminals in an 
analysis part agree in name and affix with those in the 
synthesis part of a rule, or that all such nonterminals 
appear as rules or sets elsewhere in the grammar. The SDE 
presently detects certain grammar errors during initializa- 
tion, but lacks complete semantic checks and error recovery. 


Further, it would be more desirable to detect such errors 
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during grammar creation rather than at attempted usage. To 
be a better tool, therefore, the SDE shouid include a 
routine that insures grammars are correct as they are being 
created. The semantics of such grammars are very simple, 
and it is anticipated that such a routine would be small, 


efficient, and easy to implement. 


C. IMPLICATIONS OF SYNTAX-DIRECTED EDITING 


This paper has presented an implementation of a tabie- 
driven syntax-directed editor. While such an editor is 
intended for use in creating syntactically correct programs, 
syhntax-directed editing offers capabilities that extend far 
beyond this function. 

The grammar-generating grammar in Appendix B offers a 
rapid means with which language designers can examine their 
research grammars. When contegrplating a new language, the 
designer can input his ideas into the SDE, then use it to 
quickly see a textual example of his grammar. A grammar 
checker as proposed in the previous section would further 
insure that his grammar was complete. Further, the SDE can 
display programs written in an incomplete grammar (up to a 
point), so that the grammar may be developed incrementally. 
Finally, the designer may use the SDE to create progran 
trees uSing hiS grammar, allowing him to insure that the 
desired semantic information can be derived from his source 
language. Lacking a syntax-directed editor, these results 
could only be obtained after developing a scanner and parser 
for his language (although parser-generators exist to help 
in this effort) and then writing complete programs in his 
hew ianyguage using a text editor. 

While use of a syntax-directed editor as described abcve 
facilitates the development Siattrad@tiohal” grammars, i.c., 
those restricted to the class cf£ LR(1) languages [Ref. 17: 
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p. 261], syntax-directed editing implies a far greater range 
of possible programing languages in the future. In @ie 
past, language design has been limited to those languages 
that can be parsed effectively. Syntax-directed edataime 
removes such a limitation on grammar design, since the 
editor is told by the user which production to apply ata 
given point in the progran. The program tree is generated 
directly during program creation, not through parsing a 
COME. The implication is that future languages may be 
designed to be more readable to the human reader rather than 
to a parser. Consider, for example, the use of the "where" 
clause in the applicative expression notation "L where X = 
Mz" in which the "where" clause binds variables in L, as 
listed in xX, to values derived from the expression 4 
RRSG =) 223500 ee al This is difficult to parse Convenor. 
ally, since binding details are provided after the variables 
are used and because determination of which production to 
use requires "look-ahead" of as many tokens as are in lL. 
Such a notation presents no difficulty to a syntax-directed 
editor, however, Since the user provides determinism by 
directing the application of the intended production. Note 
also that syntax-directed editing eliminates the need for 
current lexical conventions, Since no scanning need be 
performed. Thus, avariable named "this is a single vari- 
able" is perfectly acceptable when input through a syntax- 
directed editor. 

Even the one-dimensional concept of a program as a 
stream of characters is made obsolete by syntax-directed 
editing: two-dimensional languages may be designed, using 
symbols difficult to parse traditionally. For example, the 
eqguations in Figure 5.1 are unacceptable to a traditional 
parser, yet can be parsed and displayed by a syntax-directed 
editor with the proper formatting commands. in shoime 
syntax-directed editing opens an entire realm of programming 


languages previously considered impossible. 
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Figure 5.1 Examples of Two-Dimensional Equations 


Syntax-directed editors alse have an implication in the 
education of computer programmers. It would be unnecessary, 
for example, to instruct the syntax of any language. 
Whereas the Pascal student must currently know that state- 
ments always end with a semi-colon, this need not be known 
by a student using a syntax-directed editor, since the 
editor installs the semi-colon automatically. The student 
can thus learn the semantics of the language more readily, 
being freed from syntactic concerns. 

The use of a syntax-directed editor as a translator has 
already been demonstrated in Section 5.A. There, the SDE 
was used to translate grammars from Argot into the input 
format of the SDE. Translation between grammars is possible 
when both grammars have corresfonding rules with identical 
tag names and child patterns in their synthesis parts. They 
need not be completely identical, however. Note, for 
example, that there is a slight difference between the gran- 
mars in Appendices B and H in that the second alternative 
production of the "rule" alternation has a "+" affix on the 


"choice" son in Appendix B, while the affix is ".;" in 


oy 


Appendix H. Since toth are represented identically ina 
tree, this difference 1s uninpecrtane Note that even the 
word "choice" need not have been duplicated, so long as the 
new name referenced a rule that nad the same tag name and 
child pattern. 

Translation is aiso  poseilpre between higher-level 
languages, although such translation depends on the nature 


of the two languages themselves. It should be possible, for 


example, to translate a subset language program into a 
superset language progran. Some translation within a 
"family" of languages should also be possible. Appendix I 


lists modifications to the Minigol grammar to make it 
compatible with the Pascal sukset grammar of Appendix J, 
which may be perceived as a grammar for Pascal functions and 
procedures in that it allows declarations of variables 
before a master "“begin-end" block. Using these two gran- 


mars, the Pascal program segment in Figure 5.2 may be trans- 


| var, 
23 antegor, 
J; anteger; 
begin 
2 ae ae. 
while i 
begin 
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end 
end; 
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PaLguie. oa Pascal Version of Figure 2.2 


lated into the Minigcl program segment in Figure 2.2 simply 
by storing the parsed form under Pascal and reconstructing 
the tree under Minigol. The nature of this translation 


bears closer cxaminartion. 
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The Pascal grammar in Appendix J haS similar synthesis 
parts to those of the Minigol grammar in Appendix I with one 
notable exception: theme oke@ck 2" rule. In the Minigol 
Meammar, the Synthesis part of this tule creates a “block2" 
node with paths to two nonterminal sons, the "decl" sequence 
and the "stateblock"™ node. In the Pascal grammar, however, 
this synthesis part creates paths to a nonterminal "state- 
block" node anda terminal named "nil." This is because 
Pascal blocks do not include a "declarations" part while 


Minigol blocks do. (For example, Minigol would allow decla- 


ration of additional variables above the "j:= 1 * i" state- 
ment.) The interesting relaticnship between these grammars 
is that any program written in one may be displayed (or 


translated) using the other gramgmar. In the curious case of 
translating a Minigol program (witn declarations in its 
blocks) to Pascal, the declarations Simply disappear. This 
occurs because the "readprogranm" routine in the SDE recur- 
Sively constructs the tree from the stored form based only 
on the expected number of children for each tag according to 
the synthesis part that generated the tag. Sincenene Child 
patterns are the same in both grammars, "readprogram" reads 
the correct number of children and places tnem correctly in 
the tree when it encounters then. TimMcmrismoAarELeCular case, 
however, instead of reading aterminal named "nil," the 
routine encounters a sequence of declarations, and since 
declarations have their own rules in the grammar, the 
routine continues to process the file. In other words, the 
declarations are still present in the Pascal-constructed 
tree. The reason they are never seen by the user is because 
unparsing is dictated by the analysis, not synthesis, parts 
of the rules. Thus, since the analysis part of the "block2" 
rule in Pascal does not mention a "decl" sequence, it is not 
listed in the nonterminal dicticnary for that rule, and the 


unparser bypasses this child branch as if it were a terminal 
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Or the transiatwon, The implications of this invisibivie 
are that the Pascal grammar should be treated as a subset of 
the Minigol grammar, even though the two are not usually 
considered this way, and that programs should never be 
translated into Pascal from Minigol. Such a translation 
would risk causing other tools working directly on the tree 
to encounter inappropriate nodes; even tools acting on the 
text form would encounter diftiiculicy Since variables 
declared in the (Minigol) blocks would have never been 
declared in the (Pascal) text fcrm of tne progran. 

Stiil another type of translation offered by Syitaws 


directed editing is translation between representations of 


the same language. (Ref. 23], for example, describes four 
Walterhnative syntactic forns' aor a particular obgequs 
oriented language. Because all four share an indentical 


program tree format, translation from one form to another is 
a Simple matter of reading tne stored form using the grammar 
of the desired representation. Note that translation 
between forms of the same semantic languaye does net involve 
the problems mentioned above. However, since one of the 
four forms 1s two-dimensional, it may require special 
formatting commands to display a text program under this 
grammar. 

All of the above capabilities of syntax-directed editors 
are not reachable by standard text editors. Text editors 
can not translate, check syntax, or do anything to ipStiee 
the correctness of a program. What is interesting, however, 
is that a syntax-directed editor may even offer text-editing 
capabilities not found in text editors. For example, the 
structure imposed on text by a syntax-directed editor should 
enable the user to view any text top-down, stopping ata 
particular depth, so that, for example, only section yc 
(but not section contents) are displayed. in fact, Tie 


type of structure serves to provide an automatic table of 
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contents for the document. Another application of user- 
selected viewing iS in the use cf comments in a program. It 
Should be possible to structure comments into levels of 
complexity so that a reader unfamiliar with the source code 
may view only the high-level comments to get a broad view, 
while readers thoroughly familiar with the project can view 
a more specific, detailed set of comments. evar lat lon or 
user-selected viewing is creator-designated security. There 
may be certain items (text, code, etc.) that are not meant 
to be viewed by certain classes of user. These items need 
not be removed from the product, but need only be "filtered" 
from view by not displaying then. 

Syntax-directed editing is appropriate for creating and 
Manipulating anything representable hierarchically. One 
obvious candidate for such editing 1S a hierarchical data- 
base. It is relatively simple to prescribe a "grammar" for 
a hierarchical database, which may be pictured as a collec- 
tion of trees whose roots may have any number of dependents, 
each of which may have any numter of dependents, and so on 
[Ref. 24: p. 67]. Appendix G, for example, lists an SDE 
input grammar that produces a tree of information about an 
organization's training course history given the following 


Meerarchical structure: 


each course has a number, title, description, list of 


prerequisites, and list of current offerings; 
each prerequisite includes a course number and title; 


each offering includes a date offered, location, and 


Format, as well as lists of instructors and students; 


each teacher has a number and name; 


each student has a number, rame and grade [Ref. 24: p. 
280 }. 
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Course: M23 Title: DYNAMICS Descr see 
Prereqs: 
Course #: M19 Title: Gaicurus 
Course #: M16 Title: TRIGONOMETRY 
Offerings: 
Date: 750106 Location: GSLO )eformar a2 
Date: 741104 Location: LUBLIN opmar 
Date: 730813 Location: MADRID Format: F3 
Teachers: 
Emp #: 421633 Name: SHARP, R 
Students: | 
Emp #: 761620 Name: TALLIS, T Grade: B 
Emp #: 183009 Name: GIBBONS, O Grade: A 
#: 102141 Names: BYRD, W Grade: B 


Figure 5.3 SDE Representaticn of Hierarchical Database 
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Pigure 5.4 Hierarchical View of Database 
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Pecdtglenaigsplday ~or a database Created using the "grammar" 
at Appendix G is shown in Figure 5.3. Figure 5.4 contains a 
representation of the database as pictured logically in 
feta 24: p. 2ZOij. Not only docs one note that both figures 
represent the same information, but one can also imagine the 
Similarity in structure between Figure 5.4 and the tree 
constructed by the SDE to represent the database. This 
discussion does not intend to suggest that a syntax-directed 
editor is capable of performing database operations. at 
does show, however, that such editors can represent hierar- 
chically structured data. It is further suggested that such 
editors might serve as suitable editing devices to create 
and modify databases representable in a particular hierar- 


Senical structure. 


D. CONCLUSIONS AND SUGGESTIONS FOR FPURTHER RESEARCH 


The overall objective of this paper has been achieved in 
the SDE: aworking table-driven syntax-directed editor has 
been irplemented, and many lesscns regarding syntax-directed 
editing have been learned through this implementation. He 
has been shown that the SDE can support virtuaily any 
context-free grammar. AS a program, it has been implemented 
to be as portable and flexible as possible through the use 
of variakle command keys, the "TERM" file, and avoidance of 
System-specific calls. In its 1600 lines of Pascal code, it 
performs the functions of both parser and prettyprinter and 
creates, stores, and retrieves files. These accomplishments 
having been noted, the SDE must now be appraised on its 
effectiveness as a useful product. 

The SDE is too slow for actual program development. The 
compromise of the command line and the need to refresh the 
entire screen, rather than only the edited portion, cause an 


unacceptable delay between user input and program response. 
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Simple navigation about the tree is clumsy and unintuitive. 
in short, the user interface needs Significant improvement. 

The SDE also lacks features essential to a viable 
editor. Lack of any search mechanism is a severe linita- 
tion: The absence of a comment input mechanism has also 
been cited. 

As for display, the current SDE still has probiege 
forcing line breaks on long display lines. When editing 
grammars, for example, 1t cccasionally splits terminal 
strings after the opening guote, with the result that pret- 
typrinted forms are perceived to have long strings 
extending over two lines. Another problem is that the 
heuristic movement of the focus above the current node, done 
to increase efficiency and reduce the need to change depth 
and focus manually, causes "tunnel vision" when editing the 
last items in a long series. For example, attempting to 
edit the last terminals and nenterminals. in the rules in 
Appendix G causes only the concerned rule (in fact, some- 
times only a portion of the rule) to be displayed. 

As a prototype, the SDE has done its job in identifying 
these shortcomings and providing lessons from them as well 
as from its successes. Further, the shortcomings cited 
above provide ample subject material for future research. 
It is felt that the most Significant potential for improves 
ment to the SDE lies in its interactivity with the user. 
Direct editing on the CP rather than through a command line, 
accompanied by immediate update of text and menu alike, 
would help make the SDE a viable, useful editor. 

Aside from continuing the development of the program 
itself, the potential of the SDE as it already exists has 
yet to be explored. It has been suggested that the SDE has 
certain translation capabilities as well as the ability to 
create an executable program tree. Much of this abi 


stems from its acceptance of grammar-Specific synthesis 
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parts which allow both permutation of the order of nonter- 
minal children and the inclusion of terminal children. The 
effectiveness of this approach regarding translation and 
direct execution of the program tree Should be researched to 
assess its potential, and further guidelines for grammar 
design should be developed. 

This paper began with a discussion of modern programming 
environments and how syntax-directed editors fit into this 
scheme. As a final research suggestion, it is recommended 
that the SDE be placed into such an environment, accompanied 
by a set of viable grammars and interpretive tools that 
could act directly cn the trees produced by the SDE. eT 
this way the utility of the SDE as part of an overall 


programming environment may be assessed. 
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APPENDIX A 
NOTEWORTHY SDE DATA TYPES 


const alfalen = 10; 
type alfarec = record 
Wrd: adios (Berk cues packed array 
[ 12200") Of Sena 


len: 0..altaven- 
end; 


{Pointer and Record types used in the linked list of 
Grammar Rules: } 


grammar = “grec; 

alist = “*arec; 

beg oe dasa = “taggedarec; 
ier 


LSt. =] tnt bec. 
altlist = “*altrec; 
defnptr = “defnrec; 
affixrec = record 
alt; az, as. ehan- 
end; 


grec = record 
name: alfarec; 
next: ~Giailal ja. 
case iSalternation; boolean of 
tie: ets ia se altiaeen, 
false: (defn: deinptr); 


end; 

altrec = record 
Choice: Char, 
choicedefn;: defnptr; 
choicedisplay: Valrs. 
ext: -aLtirs ce 


end; 
defnrec = record 
avails tpevise: 
Ser taggedalist; 
nRtdict: almer. 
end; 
tntrec = record 
isnt: boolean; 
INEO;; a2lrenec, 
affix: ebriwcec. 
Next: thttrs.- 
end; 
taggedarec = reccrd 
Eee alfamec: 
avlist: alist: 
end; 
aL6C = lLecord 
attr: alfarecs 
Yall: alfamec- 
Va liIsne; boolean; 
aALLIxX: aliixceee 
New sds oer 
pee alist; 
end; 
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fTypes used in the Grammar's Sets: } 


setptr = “setrec; 
setrec = record 
nane: alfarec; 
members: set of char; 
next; setptr; 
end; 


{Pointer and kecord types used in the Program Tree: } 


nodeptr = “prognode ; 
childptr = *childnode; 


prognode = record 
Mame: alfarec; 
syntax: defnptr; 
aSmt: eae ae 


parent; aad : 
Chavelist:.«c coer 


end; 
childnode = record 
Peeled in dis cic. 


Gide sOde — tl; 
next: Chadd ptr; 


{Record type that maintains the Current Position 
in the tree: 


currloc = record 
Ci nodeptr: 
Bes Callvaptr. 
end; 


{Types used in the linked list of Legai Menu Choices: } 


choiceptr = “choicerec; 
choicerec = record 
choice: char; 
mescr; alta: 
next; choliceftr; 
end; 
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APPENDIX B 
DESCRIPTION OF INPUT GRAMMARS TO THE SDE 


The following paragraphs describe the format of a 
grammar Suitable for input to the SDE. At the end of the 
appendix is the input grammar used to generate other gran- 
mars, which may be considered "programs" in the "lLanguaye" 
of grammars. Because the listing 1S Suitable for input to 
the SDE, it both demonstrates and restates the prose 
description below. 

Certain semantic details should be noted at the outset. 
All strings described below must be no more than 10 charac- 
ters in length. This is not checked by the grammar- 
generatiny yrammar. A string of more than 10 characters 
Will cause the SDE to print a warning message during 
initialization, and the SDE will truncate the string ome 
characters. Other special considerations involve the use of 
double quotes and double periods in place of single charac- 
ters, as described below. Further semantic reguirements are 
listed in Appendix C. Note also that, except where speci- 
fied otherwise, items in a grammar are separated by a space. 

An input grammar iS a sequence of one or more rules 
foliowed by a sequence of zero or more sets. The sets are 
separated from the rules by a colon {":") on a Se@panpage 
line. The first rule must be a regular rule with a non-null 
Synthesis part, as described below. 

A rule is either a "regular" rule or an "alternation." 
A regular rule is composed of a name, an analysis part anda 
synthesis part. The name is a string (of up to 10 charac- 
ters) and is followed by a period (".") to delimit Gtgeen 
the rest of the rule. The analysis and synthesis parts are 


each enclosed by a set of parentheses. 
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Alternation rules are composed of a name and one or nore 
choices. The name is separated from the rest of the rule by 
a period followed by an "A" to distinguish the alternation 
from a regular rule. Further, the entire set of choices is 
enclosed ina set of parentheses. Each choice is composed 
of a single character (used to select the choice), analysis 
and synthesis parts, and a display string describing this 
choice in the menu. As above, the analysis and synthesis 
parts are each enclosed by their own set of parentheses. 
The choice character and display string are enclosed by 
Guotes. 

An analysis part 1s a sequence of terminals and nonter- 
minals. Terminals are strings delimited by quotes. i fa 
quote is to be part of a terminal, it is represented by two 
consecutive quotes, i.e., "". Nonterminals are names delin- 
ited by parentheses and followed by an affix as described 
below. 

Synthesis parts consist of a node name followed by zero 
Or more children. Each child is a path name and a child- 
mode, which are separated from each other by a period. A 
childnode is either a terminal or nonterminal as described 
above. Note that the node name ina synthesis part is 
always followed by a space. Thus, even if the node has no 
children, aspace will separate the name of the node from 
the closing parenthesis of the Synthesis part. Note also 
that synthesis parts are entireiy optional in any rule 
Eexeept the very first in the grammar. However, the opening 
and closing parentheses of the synthesis part must always be 
present. 


An affix on a nonterminal in the analysis and synthesis 


parts of a rule may consist of up to three characters. The 
mae S + Snacacter may be a mE Nee Aap ts wf La tiekel ies OL 
emyeof the 10 digits. hemencweitnpactelalSra digit or "'" 


Symbol, a second character is included which may be any of 
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the above symbols (excluding the digqgitseange Also, i 
the first character (or second, if the first waS a digits 
“rn, is atl", a second (or third) character is added.) ime. 
character may be any frintable character at all. 

A set consists of a name and a string of characters. 
The name is separated from the string by a period. Note 
that this string is not limited to 10 characters in lengua 
in fact, it may extend over several lines. The string is 
ended by a double period. (A single period means the symbol 
we" as to be added to the string.) 

The following is a grammar demonstrating the above 
syntax. It is also the grammar used by the SDE to create 
"programs" in this syntax, and is therefore self-describing 
as well as suitable for input tc the SDE: 


grammar. ((rule)+ "Z%:" (set) * ) 
(gram r.(rule)+ s. (Set) * ) 


See (lage (name) & iP t anal & ae (i 7 n J) t mn 
(re n. (name) & "a. (anal}é es (sant, | t require 
go aie (name)& ".A(! ees + 
a 


n. (name) & Ch oneee ) 2 renee 


name. ((char) + } 
(name c. (Cnhar)+ ) 


choice, disp ie r)é TUTE UT Te (anal) 5 aN \ cas (syn) ? ") eevee 


views ee 


(Riesees Ee erties & a. Canaunte s.(syno)? d.(display) & ) 


Abed 1 at = t ue it 
ae nea eee Ee 


heey 
t+ tt Te te erat (name) & PL ee | 
(term SOE LnA ) "Lerninae 
Wy (? cy (name) & 1 tt (artis & wo 
(nterm n. (name)& a. (affix)s& "nontern") 


See see Hot 2 Ghee ) ; 
synpart nOd aN HSaeTE CG. (Chad) aa 


oa cmenel ) 
Ce cba tat a (ene 

aac. (path) & ch. (cnode) & ) 
a a ) 


SO ae ) 


ie Xe A 


tem gle 
(& MsiIn glen t" 

agen tty tt 
+1 "Kleene +" 

tf ke tf (! on 
* eee sat 

Wo (' om 
71 reer2ye) 


men a ae & ) 
eee ine aa Ee “prige" 
nq ( primee a. eas 
t 


rine ao Tattle (a ele ee ee age! 
re ( (char) } 
(list) Se Eee Lest!) 
pe 1x 2. A 
Wem OS 
(& 2 Sime nt 
Wt iby et 
+2 "Kleene +!" 
tx tt (! “ 
(*2 "Kleene *" 
wo (! ott 
(?2 ae 
- t (' 1? (char 


(li Stz Cc) « (char) & eee es) 


display. ((char 
ads athay cs ec. (char) + ) 


set. eee (name) & wo (Char) + th eer) 
et n.(name)& c. Peace ) 


yZABCDEFGHIJKLMNOPQRSTUVWX YZ 


eat a apedet ahs ey tat ie ja#$.— Saye 


1234567890 ,.<>/? 
digit.0123456789.. 


APPENDIX C 
SEMANTIC RESTRICTIONS OF GRAMMAR DESIGN 


The following rules apply to grammars to be used as 


INDE et Ortne —SDE- The syntax of such grammars may be found 


in Appendix B. 


7) 


2) 


For a given regular rule or alternation choice: 


a) every nonterminal in the analysis part must also 
appear as a child in the synthesis part, and vice versa. 
The nonterminals must match identically, including their 
affixes. (The only excepticn to this rule is the "iden- 
tity," which consists of exactly one nonterminal in the 
analysis part, and no synthesis part at all.) The order 
of the nonterminals in the analysis part is independent 


of their order in the synthesis part; 


b) the synthesis part may have additional terminal chil- 
dren, and the analysis part may have additional terminal 


strings -- these are not related to each other at all; 


c) the path name of each child for a given synthesis 


part must be unigue; 


d) each nonterminal in the analysis part (and synthesis 
part) must be unique. If more than one appears, they 
are to be distinguished by their affixes: the first 
affix of all but one (or all of them, if preferred) will 
be a unigue symbol selected from the set of digits and 


tem Oma temn tt s20) - 
Across the entire grammar: 
a) every cule and set name gust be unigue; 
b) the tag name of every synthesis part must be unique; 


Pe 


6) 


C) every nonterminal name in analysis and synthesis 


parts must either be a rule name or a set nane. 


Every non-identity regular ruie must have at least one 


nonterminal in its analysis and synthesis parts. 


The first rule in the grammar must be a non-identity 


megqular cule. 


5) Individual choices of an alternation nay: 


a) be similar to non-identity regular rules in that 


their synthesis parts may include a tag and at least one 


nonterminal son; 


b) have synthesis parts that have only a tag (and no 


Children); 


c) have synthesis parts with tags and terminal children 


only; 


d) be identities that reference rules as described ina, 
b, or c above. (Note that the end rules would meet the 
"regular rule" syntax -- but those in b and c above must 
not be referenced as regular rules elsewhere in the 
grammar, since they violate "regular rule" semantics.) 
Note that identities may also reference other identities 
to form a "chain" leading tc a rule as described in this 


section. 


Choices of alternations, therefore, must NOT be sets, 


other alternations, or identities leading to sets or alter- 


nations. 


APPENDIX D 
INPUT GRAMMAR FOR EXAMPLE IN CHAPTER 3 


Le. ( (A) & 
Pensa neil a) (A)S ) 


& ) "a ‘a T" 
wo A & +4 a Ae 
a2 opnd1. (sj opnd2. (T)& oprtr."add" ) “A=—-> \Aueueee 
& 
Py 


A 
dt ( F 
~(F)§ ) "T--> FN" 


Won de Ww * «6 4 
t2 opnd1. (T)& cpnd2. (F)& oprtr."mpy" ) “T==> pee 


ah ee 


A 
4 EtG. (ghar) é te S=> chow 


wom en a We my i 
(2 ~({A)& ) he Soa) 


Gial. ars... 


APPENDIX E 
MINIGOL GRAMMAR 


eek, eh Pegl a" ee sii (Statement).; "%end!"™ ) 
(block head ecl)* body.(statement).; 


d i mo\r + S dys wi 
ee ec} n Seay Bec ele ‘ 


a Roi (inte eb! 
(1 "integer" 
typi (' real t 
(real ) “real") 


Statement.A {( 
"a" ( (assign) & ) 
).“aSsignment" 
my tt ( whileloop & 


mere oat 
tbh" (( block) & ) 
Dao ae 
"i" ((ifstat) 1 
ee diaie a mnt") 
Meeaegn. ("%\" (var)s& ": exp)& "i" 
ta ae Ane S. (ONE PI 


ifstat. ("A\iF " (relation & " then" (statement) & 
(elsep bare arene 
(2= cond. (rela $on) § conseg.(statement)& 
alt. (elsepart)? 


elsepart.("Z%else" (statement)é& ) 
pea ise s. (statement) & ) 


Bee eyes te ("A\while " (relation)& " do" (statement)&S "!" ) 
while cond. (relation)& body. (statement) & ) 


lation. & i 
- peti relopye 4. (exppe 2. (exp) 's ) 


relop.4 cin | — he 
Wht ae 3) hi a z 
te it oe : ), not 
. i Zz i 
wy av? q 
ig‘ As >= 2 any 
(ge) "2=") 
oer )& " + 0 (term n) & 
Aa ls (€Xp )& 2-(term)& ) “exp + trom" 
fe . 1f ( ot & t t term) eS 
Wee) 2 2.(term) & CexXp  — “bE” 
ett | Senne ELA cLOr \G 


ae 


mpeg ferayg 7 H (eactorye | ) “CEA fetrm: 
ern accor 
vi i. (tera) ¢ 2. (factor)s }) "tri weer 
fre. See Seen "(ex a) 
Et ee é 
"number" 


a Vota) ab Tem 


mse ae orn ) Poe Faso) ) “RE terete 
er i ac 
le ae erm) J*3 EAC HORNE ) “trn /epermn 
“ACG ee ooo 
sank 


Wet ( nuaber) ¢ i 
Bue er 
Ny lt var) 
Pye aeRoest) 


t 
wn noe saccron ke 
"(exp)" 


y Ve) }, (exp) Ww 
etd faugedeie 

Bua 
oye var) & 

wyartablet) 


Val. rer ) 
ay 
ie Ae eas ' 


number. ( (di ith 
(num va fai @ueye ) 


char.abcdefghijklmnopgrstuvwxyz... 
diGa t20 125456739 ..4 


DESGRIPTION OF THES TERM” FILE 


ier "Ten stele “required by the SDE during initializa- 
tion is a text file consisting ci 5 lines of integers. The 
first four lines in the file contain information about a 
particular terminal's commands to activate and inactivate 
inverse video, clear the screen and move the cursor to the 
beginning of the SDE menu display on the screen. The fiith 
line contains dimensional information about the display 
area. 

The information in the first four lines follows the same 
pattern. The first integer on eaca line represents’ the 
humber of integers that follow on that line. (This tells 
the SDE how many integers are to be read on the line. Since 
the SDE reads the information as integers, it can not detect 
ends of lines without this knowledge.) The second integer 
tells the SDE how many spaces on the screen will be physi- 
cally occupied by the rest of the sequence on that line. 
The remaining integers represent a seyuence of characters 
that will cause the terminal to perform as desired. 
Conversion of the integers to characters is done using the 
Baseal “chr’® function; when using the ASCII character set, 
therefore, the number "65" represents the character "A", 
while the number "17" represents the character "control-W". 

The purpose of the second integer on each of the first 
four lines is to inform the unparser how many columns to 
count when invoking the sequence. Usually, eet y aierieina 
inverse video causes no spaces to be used on the screen for 
those terminals that support it. However, for terminals 
that do not Support inverse video, a printable sequence of 


characters must be used instead. These characters will take 


up space on the display, and the SDE must account for these 
Fositions to determine where to ‘force carriage returns on 
Lomd lines. 

Note that the SDE uses these integer strings as "black 
boxes" to achieve desired effects. This not only achieves 
terminal independence, but it also allows the user sone 
freedom in customizing his display. For example, Figure 4.1 
Showed one means of displaying the CP on terminals without 
inverse video. The user could easily have used "<<<" and 
">>>" instead of "-3}" and "f{-" by modifying the fits 
second lines of the "TERH" file. Note, however, that it is 
a good idea to include at least a single space (ASCII number 
BZ) in the "inverse off" sequence, especially on those 
terminals that do support inverse video. This is because 
the CP may reference a "nil" necde, which has no display of 
its own. Activating and inactivating inverse video in 
Succession (without printing a space in between these 
actions) will have no visible effect at ali on most termi- 
nais, and the CP will seem to disappear. 

The fourth line of integers moves the cursor to the 
beginning of the menu display on the screen. On a 24-line 
screen, this should be the eighteenth line, first coiumn. 
This position affords a sufficient amount of space for the 
menu while leaving most of the screen for the progran 
display. However, in grammars with very long alternation 
lists, or on a screen where only three commands wiil fit on 
a single line, the menu must tCegin higher on the screen to 
prevent scroliing. Customization of the menu for the 
terminal and particular grammar in use iS a refining 
process. 

The final line of the "TERM" file provides information 
on the dimensions of the screen. Like the other lines, the 
first integer tells the SDE how many integers follow; 


however, the line contains no integer describing how many 


118 


Spaces are used by the rest of the line. Rather, the second 
integer represents the number of lines on the display 
(usually 24 or 16); the third integer contains the number of 
columns (usually 80 cr 64); and the fourth integer states 
the line on which the menu commences. Note that this last 
integer is required because this information is not always 
obvious from the sequence on the fourth line -- each 
terminal has its own algorithm for moving the cursor. 

The following figure represents the "TERM" file for the 
VT100 terminal. Brackets have been added to include con- 
Ments, which would not appear at all in the actual "TERM" 
ie 2. 


| ? 2 equence to activate inverse video} 


= | 
9 {turns lnverse video off. Note 
{ 
| 
ad 


Ome Fie 10 4 
ieee! St 88 7 
Enc itieand che "S52" for a space} 
pmeom2? 91 59-72 27 91 74 eeu the screen} 
70 27 91 4S 57 59 ae i2 74 {moves cursor to 
{18 ee aaa Tst colunn} 
3 24 £0 18 {screen data} 


Figure F.1 'Term' File for VT100 Terminal 
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APPENDIX G 
SAMPLE GRAAMAR FOR A DATABASE APPLICATION 


sa aad ar ) 
db c ae ; 


course. ("%Course #: tere " Title: “(titers 
escr; " (letter) + "ma\Prereqs:" "\" (preregq)= 
mig WOfEer in sot eerie S)* wiige 
(eee ls eee a i (‘titlge s rings) # 3.(prereg) * 
. (offerings) * 
preregq. ("S" "Course#: crsnunj)& " Title: “ (titie)e 
(preregq 1. (ene anene « (tp €) oa) 
crsnum. ({letter) & (digit) & ay Ve 
ee is see ete ys (deg 2th 5 (di gaten ve ) 
eee (eww: ) 
(title etter)+ ) 
offerings. ("ADate: “ (digase) +s! e oCGat One 
(Letter) tau FOP acecu. " "  Plet pem rs igi Lt) & 
m\Teacners"™ "s\" (teacher) * TTastuden ts 1 oN 


(Student) * "rie 
Kearse ie (ne it)+ 2. (letter) + 3. (letter) & 4. (digit) & 
(teacher)* 6 - (Student) * ) 


teacher. ("&" he rsuaee ) 
(teacher (prsndat)é ) 


prsndat. ("Empé ; (ara me - : " (letter) aa) 
(erendat 4 Sees (fetta ) 


student. ("4" Peres ut ade: (letter)& ) 
(student (prsndatia. 2 > TLeeeeae ) 


letter. ABCDEFGHIJKLMNOPORSTUVWXYZ., «. 
digit.1234567890.. 
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APPENDIX 
ALTERNATE GRAMMAR FOR GENERATING GRAMMARS 


The following grammar represents Argot, a grammar- 
fescribing format. PeelcaSit came OL Input to tne SDE and 
Subsequent use aS a graMmar-generating grammar by users who 


prefer its format over the syntax of the grammar in Appendix 


be 
grammar.((rule)+ "2%sets:" (set)* ) 
(gram r.(rule)+ s.(set)* ) 
fe) Af 
ei (! oon (name) & 1 "! (ana al & "Wy => 1 (Syn) ? mon 
(re N. (name) & a. (anal) g5 GY a regular" 
Mam ("eZn (name)& ": {" (choice).;° "}" 
(ait n.(name)& “c. (choice) .? )} “altrnation") 


name. ( (char) + 1 
(name aie ar)+ ) 


choice. ("&\" (Char)& "cm (display)& "): " (anal) & "%\\=> 


Ss 
fonedice c. (charié deamdi) 6 Stee ds. (display) € ) 


eo (tnt 
ane" (anal List. (tnt) + ) 


aaa 
ty u reetes (name) & teeete one 
(term hot na ean ) Sere 
to (! <i (name) & 7 (af fi950) & 9 O98 
(nterm n. (name)& a. (affix)& ) "nontern") 


ge | nodes ete teaaey *) 
syiraetenede. (node) >o ¢. (child) * ) 


mae name) & ) 


ie 14. th 5 Ww 
chi 2 {Pa ae cae eoHe) & ) 


path, ((nane) ¢ ) 
fee te ) 


coon x.A ( 
te "? ag 7 
{ tno affix" 
eae (ity tt 
fat 
Woe che 
tte af 
wot can 
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(21 yeu 
men (ite (afzix2 ye) | 

shar} e” (atfix2)& ) "Deane 
ee Cre Nao ihe gt 

1ist ¢.(char)s ) "Ips 


affix A I 
tte it fi 
( wo: afr 
Wet tan 
(+ +2 thy tt 
wan (en 
we) trot 
won ct on 


a0 ton 
ren {eae 8 u ) 
list cG (cl Pa ) Tivses) 


d : 
1sP4az. aes 


2 i ou & tt it h 7 tt tt 
“7 i ae Rey aera ne CG; ; (has) ae 


oN onon Roig WiyZabederghi jk T Rep 
O90 a7 a = t= IN 1s 


As stated, the above grammar is suitable for designing 
grammars ina different format from that of Appendix B. An 
example of the format described above is the following, 
which is the Same grammar in the new format. 


grammar: <rule>+ "%%sets:" <setd* 
=> gram; r=<rule>+ s=<set>* = 


rule; 
¢ (regular) : ae <name> "; ™ <anal> "KK => " <Sya> 72a 
=> re =<name> a=<anal> s=<syn>? ; 
a(altrnat ion) : "ZZ" <name> ": {" <choice>;... "}" 
=> alt: n=<nane> ¢= <choice>;... 


hame: <char>+ 
=> name: c=<char>+ 


,schar> "(" <display> "): " <anal> “AN =e 
c=<char> a=<anal> s=<syn>? d=<display> . 


Ghoice+ UN" 
<sSia 2 ee 


=> cholc 


Ana GeneD + 
=> anal: Llist=<tnt>+. 


thoes 
& (ee erminal): terete rt <pDame> trreee otf 
=> term: n=<name> : 
n(nonterm): “<" <name> ">" <altis oe 
=> nterm: n=<name> a=<affix> } 
SY <node> est: -"' coh ta = 


ne 
> synpart: node=<node> c=<child>* . 
de: <name> 
> 


1d 


Wee 


Creare “pacn> =" <cnode> 
=> child: p=<path> cn=<cnode> . 


Bees <name> 


Buoce: <tt> 


eee x sf 
6 (no_ affix): oe 


ees 
x (): a ; 


2 ee) - ND 
2 e 
t (prine) : win <affix2> 
=> prime: 2= <atfix2> : 
mi@eistjs <Char>  ™s.. 
=> jist: c=<char> } 


eee] x 2 ¢ d 
& (no_ < Se Ne Bk 
+ (+ aah , 
Nz, op 
* (*) 2 tt 
=> *2:; 
mae?) «: Wit 


= ier: 
me LASt) ; ae Serene 
=> list: c=<char> } 


eae oie <char>+ 
> dsply: c=<char>+ . 


fear a” <name> “ = {" <char>,... "3" 
=> set: n=<name> c=<char>,... . 


Secs 2 
Char = oe eo ee RSTUVWKYZabcdefghijklmanopgqrstuvwxyz 
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MINIGOL GRAAMAR MODIFIED FCR PASCAL COMPATIBILITY 


The following is a modification of the "Minigol" gqifamman 
presented in Appendix E. The purpose of the nodification is 
to make minigol ccmpatible witn the Pascal grammar in 
Appendix J. see Section 5.C for the implications Gime 


compatibility. 
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PASCAL SUBSET GRAMMAR 
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